Streaming data from NASA's Earth Surface Mineral Dust Source Investigation (EMIT)
This proof-of-concept notebook demonstrates how earthaccess can facilitate the use of cloud-hosted NASA data with xarray and holoviews. For a formal tutorial on EMIT, see the official repository, where things are explained in detail: EMIT Science Tutorial.
Prerequisites
- NASA EDL credentials
- Openscapes Conda environment installed
- For direct access this notebook should run in AWS
IMPORTANT: This notebook also runs outside AWS, but that is not recommended because streaming HDF5 data out of region is slow.
import earthaccess
import numpy as np
import xarray as xr
import h5netcdf
from pprint import pprint
print(f"using earthaccess version {earthaccess.__version__}")
auth = earthaccess.login()
using earthaccess version 0.9.0
Searching for the dataset with .search_datasets()
Note: API docs can be found at earthaccess
results = earthaccess.search_datasets(short_name="EMITL2ARFL", cloud_hosted=True)
# Let's print our datasets
for dataset in results:
    # pprint() prints the summary itself and returns None, so it is not wrapped in print()
    pprint(dataset.summary())
Datasets found: 1
{'cloud-info': {'Region': 'us-west-2',
                'S3BucketAndObjectPrefixNames': ['s3://lp-prod-protected/EMITL2ARFL.001',
                                                 's3://lp-prod-public/EMITL2ARFL.001'],
                'S3CredentialsAPIDocumentationURL': 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME',
                'S3CredentialsAPIEndpoint': 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'},
 'concept-id': 'C2408750690-LPCLOUD',
 'file-type': "[{'FormatType': 'Native', 'AverageFileSize': 1.8, 'Format': "
              "'netCDF-4', 'TotalCollectionFileSizeBeginDate': "
              "'2022-08-09T00:00:00.000Z', 'FormatDescription': 'Network "
              "Common Data Format Version 4', 'AverageFileSizeUnit': 'GB', "
              "'Media': ['Earthdata Cloud', 'HTTPS']}]",
 'get-data': ['https://search.earthdata.nasa.gov/search/granules?p=C2408750690-LPCLOUD',
              'https://appeears.earthdatacloud.nasa.gov/'],
 'short-name': 'EMITL2ARFL',
 'version': '001'}
Searching for the data with .search_data() over Ecuador
# ~Ecuador = -82.05,-3.17,-76.94,-0.52
granules = earthaccess.search_data(short_name="EMITL2ARFL",
                                   bounding_box=(-82.05, -3.17, -76.94, -0.52),
                                   count=10)
print(len(granules))
Granules found: 35
10
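The bounding_box tuple used in the search is ordered as (west, south, east, north), that is, lower-left longitude and latitude followed by upper-right longitude and latitude. A minimal sketch of a helper that checks this ordering before a search is sent; make_bounding_box is a hypothetical name used for illustration and is not part of earthaccess.

```python
def make_bounding_box(west, south, east, north):
    """Return a (west, south, east, north) tuple after basic sanity checks.

    Hypothetical helper for illustration; not part of earthaccess.
    """
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitudes must be within [-90, 90]")
    if south > north:
        raise ValueError("south must not exceed north")
    # west > east is left unchecked: a box crossing the antimeridian looks like that
    return (west, south, east, north)


# Approximate box over Ecuador, same values as in the search above
ecuador = make_bounding_box(-82.05, -3.17, -76.94, -0.52)
```

The resulting tuple can be passed as bounding_box=ecuador. Swapping latitude and longitude is an easy mistake to make, and for some inputs the range checks catch it before any request is made.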
earthaccess can print a preview of the data using the metadata from CMR
Note: earthaccess currently has a bug where the reported size of the granules is always 0; a fix is expected next week.
granules[7]
Data: EMIT_L2A_RFL_001_20230304T151234_2306310_003.nc, EMIT_L2A_RFLUNCERT_001_20230304T151234_2306310_003.nc, EMIT_L2A_MASK_001_20230304T151234_2306310_003.nc
Size: 3578.78 MB
Cloud Hosted: True
Streaming data from S3 with fsspec
Opening the data with earthaccess.open() and accessing the NetCDF as if it were local
If we run this code in AWS (us-west-2), earthaccess can use direct S3 links. If we run it outside AWS, earthaccess can only use HTTPS links, because direct S3 access to NASA data is only allowed in region.
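The in-region rule can be sketched as a small decision function. This is only an illustration under a loud assumption: it guesses the current region from the AWS_REGION or AWS_DEFAULT_REGION environment variables, which are not guaranteed to be set on every AWS machine. The function name pick_access and the labels "direct" and "external" are hypothetical and chosen for this sketch; earthaccess makes this choice on its own when opening files.

```python
import os


def pick_access(data_region="us-west-2", env=None):
    """Return "direct" (S3) when running in the data's region, else "external" (HTTPS).

    Hypothetical helper; assumes the region is exposed through environment variables.
    """
    env = os.environ if env is None else env
    current = env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")
    return "direct" if current == data_region else "external"


# Example with explicit environments instead of the real one
in_region = pick_access(env={"AWS_REGION": "us-west-2"})
out_of_region = pick_access(env={})
```

The default data_region matches the 'Region' value reported in the dataset summary for EMITL2ARFL.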
# open() accepts a list of results or a list of links
file_handlers = earthaccess.open(granules)
file_handlers
Opening 10 granules, approx size: 42.27 GB
[<File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230217T204403_2304813_036/EMIT_L2A_RFL_001_20230217T204403_2304813_036.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230217T204403_2304813_036/EMIT_L2A_RFLUNCERT_001_20230217T204403_2304813_036.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230217T204403_2304813_036/EMIT_L2A_MASK_001_20230217T204403_2304813_036.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230220T195555_2305113_042/EMIT_L2A_RFL_001_20230220T195555_2305113_042.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230220T195555_2305113_042/EMIT_L2A_RFLUNCERT_001_20230220T195555_2305113_042.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230220T195555_2305113_042/EMIT_L2A_MASK_001_20230220T195555_2305113_042.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230224T182202_2305512_038/EMIT_L2A_RFL_001_20230224T182202_2305512_038.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230224T182202_2305512_038/EMIT_L2A_RFLUNCERT_001_20230224T182202_2305512_038.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230224T182202_2305512_038/EMIT_L2A_MASK_001_20230224T182202_2305512_038.nc>, <File-like object HTTPFileSystem, 
https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230225T173354_2305611_028/EMIT_L2A_RFL_001_20230225T173354_2305611_028.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230225T173354_2305611_028/EMIT_L2A_RFLUNCERT_001_20230225T173354_2305611_028.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230225T173354_2305611_028/EMIT_L2A_MASK_001_20230225T173354_2305611_028.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230228T164727_2305911_024/EMIT_L2A_RFL_001_20230228T164727_2305911_024.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230228T164727_2305911_024/EMIT_L2A_RFLUNCERT_001_20230228T164727_2305911_024.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230228T164727_2305911_024/EMIT_L2A_MASK_001_20230228T164727_2305911_024.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230228T164739_2305911_025/EMIT_L2A_RFL_001_20230228T164739_2305911_025.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230228T164739_2305911_025/EMIT_L2A_RFLUNCERT_001_20230228T164739_2305911_025.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230228T164739_2305911_025/EMIT_L2A_MASK_001_20230228T164739_2305911_025.nc>, <File-like object HTTPFileSystem, 
https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230304T151222_2306310_002/EMIT_L2A_RFL_001_20230304T151222_2306310_002.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230304T151222_2306310_002/EMIT_L2A_RFLUNCERT_001_20230304T151222_2306310_002.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230304T151222_2306310_002/EMIT_L2A_MASK_001_20230304T151222_2306310_002.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230304T151234_2306310_003/EMIT_L2A_RFL_001_20230304T151234_2306310_003.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230304T151234_2306310_003/EMIT_L2A_RFLUNCERT_001_20230304T151234_2306310_003.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230304T151234_2306310_003/EMIT_L2A_MASK_001_20230304T151234_2306310_003.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230423T192457_2311313_035/EMIT_L2A_RFL_001_20230423T192457_2311313_035.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230423T192457_2311313_035/EMIT_L2A_RFLUNCERT_001_20230423T192457_2311313_035.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230423T192457_2311313_035/EMIT_L2A_MASK_001_20230423T192457_2311313_035.nc>, <File-like object HTTPFileSystem, 
https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230424T183609_2311412_034/EMIT_L2A_RFL_001_20230424T183609_2311412_034.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230424T183609_2311412_034/EMIT_L2A_RFLUNCERT_001_20230424T183609_2311412_034.nc>, <File-like object HTTPFileSystem, https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230424T183609_2311412_034/EMIT_L2A_MASK_001_20230424T183609_2311412_034.nc>]
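Each granule contributes three files to the list above (RFL, RFLUNCERT and MASK), so a position in the list does not by itself say which product it points to. A minimal sketch that groups list positions by product using the file names; it works on plain URL strings copied from the output above, and group_by_product is a hypothetical helper. How to read the URL back from a file-like object depends on the fsspec class and is left out here.

```python
import re
from collections import defaultdict

# Matches names such as EMIT_L2A_RFLUNCERT_001_20230220T195555_2305113_042.nc
PATTERN = re.compile(r"EMIT_L2A_(RFL|RFLUNCERT|MASK)_\d{3}_(\d{8}T\d{6})_\d+_\d+\.nc$")


def group_by_product(urls):
    """Map product name -> list of (list index, acquisition timestamp)."""
    groups = defaultdict(list)
    for index, url in enumerate(urls):
        match = PATTERN.search(url)
        if match:
            groups[match.group(1)].append((index, match.group(2)))
    return dict(groups)


base = ("https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/"
        "EMITL2ARFL.001/EMIT_L2A_RFL_001_20230220T195555_2305113_042/")
urls = [
    base + "EMIT_L2A_RFL_001_20230220T195555_2305113_042.nc",
    base + "EMIT_L2A_RFLUNCERT_001_20230220T195555_2305113_042.nc",
    base + "EMIT_L2A_MASK_001_20230220T195555_2305113_042.nc",
]
groups = group_by_product(urls)
```

With the positions grouped this way, picking a reflectance file rather than an uncertainty or mask file becomes an explicit choice instead of a guess at an index.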
%%time
# We can use any file from the list; index 4 is an RFLUNCERT file,
# which is why the dataset below holds reflectance_uncertainty.
file_p = file_handlers[4]
refl = xr.open_dataset(file_p)
wvl = xr.open_dataset(file_p, group='sensor_band_parameters')
loc = xr.open_dataset(file_p, group='location')
ds = xr.merge([refl, loc])
# Attach pixel indices and per-band parameters as coordinates
ds = ds.assign_coords({
    'downtrack': (['downtrack'], refl.downtrack.data),
    'crosstrack': (['crosstrack'], refl.crosstrack.data),
    **wvl.variables,
})
ds
CPU times: user 986 ms, sys: 202 ms, total: 1.19 s
Wall time: 1min
<xarray.Dataset>
Dimensions:                  (downtrack: 1280, crosstrack: 1242, bands: 285, ortho_y: 2035, ortho_x: 1801)
Coordinates:
  * downtrack                (downtrack) int64 0 1 2 3 4 ... 1276 1277 1278 1279
  * crosstrack               (crosstrack) int64 0 1 2 3 ... 1238 1239 1240 1241
    wavelengths              (bands) float32 ...
    fwhm                     (bands) float32 ...
    good_wavelengths         (bands) float32 ...
Dimensions without coordinates: bands, ortho_y, ortho_x
Data variables:
    reflectance_uncertainty  (downtrack, crosstrack, bands) float32 ...
    lon                      (downtrack, crosstrack) float64 ...
    lat                      (downtrack, crosstrack) float64 ...
    elev                     (downtrack, crosstrack) float64 ...
    glt_x                    (ortho_y, ortho_x) float64 ...
    glt_y                    (ortho_y, ortho_x) float64 ...
Attributes: (12/38)
    ncei_template_version:  NCEI_NetCDF_Swath_Template_v2.0
    summary:                The Earth Surface Mineral Dust Source ...
    keywords:               Imaging Spectroscopy, minerals, EMIT, ...
    Conventions:            CF-1.63
    sensor:                 EMIT (Earth Surface Mineral Dust Sourc...
    instrument:             EMIT
    ...                     ...
    southernmost_latitude:  -4.024231286940166
    spatialResolution:      0.000542232520256367
    spatial_ref:            GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
    geotransform:           [-8.23355732e+01 5.42232520e-04 -0.00...
    day_night_flag:         Day
    title:                  EMIT L2A Estimated Surface Reflectance...
Plotting non-orthorectified data
Use the following code to plot the Panel widget when you run this notebook on AWS us-west-2.
import hvplot.xarray
import panel as pn
import holoviews as hv
pn.extension()
b850 = np.nanargmin(abs(ds['wavelengths'].values-850)) # Find band nearest to value of 850 nm (NIR)
image = ds.sel(bands=b850).reflectance_uncertainty.hvplot('crosstrack', 'downtrack', cmap='viridis')
stream = hv.streams.Tap(source=image, x=255, y=484)
def wavelengths_histogram(x, y):
    # Spectrum of the pixel nearest to the tapped (crosstrack, downtrack) position
    histo = ds['reflectance_uncertainty'].sel(crosstrack=x, downtrack=y, method='nearest').hvplot(x="wavelengths", color='green')
    return histo
tap_dmap = hv.DynamicMap(wavelengths_histogram, streams=[stream])
pn.Column(image, tap_dmap)
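The band lookup in the plotting code picks the index whose wavelength is closest to 850 nm, ignoring NaN entries. A minimal sketch of that step on its own, using a short made-up wavelength array rather than the real EMIT band centers; nearest_band is a hypothetical helper name.

```python
import numpy as np


def nearest_band(wavelengths, target_nm):
    """Return the index of the band whose wavelength is closest to target_nm."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    # nanargmin skips NaN entries instead of propagating them
    return int(np.nanargmin(np.abs(wavelengths - target_nm)))


# Made-up values for illustration only, not the real EMIT wavelengths
wavelengths = np.array([381.0, 620.5, 845.2, 852.6, np.nan, 2493.0])
b850 = nearest_band(wavelengths, 850)
print(b850)  # 3, since 852.6 nm is closer to 850 nm than 845.2 nm
```

Because bands has no coordinate of its own in the dataset shown earlier, the result is a positional index.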