stac-rs

High-performance, reliable STAC tooling with Rust

by Pete Gadomski

@gadomski

No Rust code in these slides

🤯

A bit about STAC

  • Spatio-Temporal Asset Catalog
  • Open community specification, based on GeoJSON
  • Core v1.0.0 released May 2021
  • API v1.0.0 released April 2023
STAC is a map to the data.
— Hobu

STAC entities

Development

STAC usage

Why stac-rs?

  • Software ecosystem diversity
  • Fill tooling gaps for
    • Servers (developers)
    • Consumers (data users)
  • Language binding support (e.g. Python, WASM)

Caveat

⚠️ Not used in production AFAIK

Installing


$ cargo install stac-cli

or

$ pip install stacrs-cli  # missing some features, e.g. GDAL


Then:


$ stacrs --help
					

Command line interface for stac-rs

Usage: stacrs [OPTIONS] <COMMAND>

Commands:
  item       Creates a STAC Item
  migrate    Migrates a STAC value from one version to another
  search     Searches a STAC API
  serve      Serves a STAC API
  sort       Sorts the fields of STAC object
  translate  Translates STAC values between formats
  validate   Validates a STAC object or API endpoint using json-schema validation
  help       Print this message or the help of the given subcommand(s)

Options:
  -c, --compact                        Use a compact representation of the output, if possible
  -i, --input-format <INPUT_FORMAT>    The input format
  -o, --output-format <OUTPUT_FORMAT>  The output format
  -h, --help                           Print help
  -V, --version                        Print version
					

For producers

Create


$ stacrs item an-id
{
	"type": "Feature",
	"stac_version": "1.0.0",
	"id": "an-id",
	"geometry": null,
	"properties": {
		"datetime": "2024-08-27T20:37:29.293151Z"
	},
	"links": [],
	"assets": {}
}
					

UNIX-y


$ stacrs item an-id > item.json && cat item.json
{
	"type": "Feature",
	"stac_version": "1.0.0",
	"id": "an-id",
	"geometry": null,
	"properties": {
		"datetime": "2024-08-27T20:37:29.293151Z"
	},
	"links": [],
	"assets": {}
}
					

$ stacrs item an-id item.json && cat item.json
{"type":"Feature","stac_version":"1.0.0","id":"an-id","geometry":null,"properties":{"datetime":"2024-09-06T16:48:10.156061Z"},"links":[],"assets":{}}                                        
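The two outputs above carry the same item; only the serialization differs. A minimal sketch of that difference with Python's standard `json` module (this is not how stacrs does it internally, just an illustration of indented versus compact JSON):

```python
import json

# The item from the compact output above.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "an-id",
    "geometry": None,
    "properties": {"datetime": "2024-09-06T16:48:10.156061Z"},
    "links": [],
    "assets": {},
}

# Tab-indented, like the output printed to the terminal.
pretty = json.dumps(item, indent="\t")

# No whitespace between tokens, like the file written to disk.
compact = json.dumps(item, separators=(",", ":"))

print(pretty)
print(compact)
```

Both strings parse back to the same object, so downstream tools can read either form.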
					

$ stacrs item an-id | stacrs validate && echo "OK"
OK
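Per the help text, `stacrs validate` uses json-schema validation. The sketch below is a much weaker stand-in that only checks the top-level fields the STAC Item specification requires; the function name `missing_item_fields` is made up for this example and is not part of stacrs.

```python
# Top-level fields every STAC Item must have.
REQUIRED_ITEM_FIELDS = (
    "type",
    "stac_version",
    "id",
    "geometry",
    "properties",
    "links",
    "assets",
)


def missing_item_fields(item):
    """Return the names of required fields that are absent from the item."""
    missing = [field for field in REQUIRED_ITEM_FIELDS if field not in item]
    # "datetime" must be present in properties (its value may be null).
    if "datetime" not in item.get("properties", {}):
        missing.append("properties.datetime")
    return missing


good = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "an-id",
    "geometry": None,
    "properties": {"datetime": "2024-08-27T20:37:29.293151Z"},
    "links": [],
    "assets": {},
}
bad = {"type": "Feature", "id": "an-id"}

print(missing_item_fields(good))
print(missing_item_fields(bad))
```

A real validator also checks value types, extensions, and version-specific schemas, which this sketch does not attempt.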
					

Create (raster)


$ stacrs item \
  https://storage.googleapis.com/open-cogs/stac-examples/20201211_223832_CS2.tif
					

Create many


$ stacrs items images/*.tif > item-collection.json
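Assuming `item-collection.json` follows the STAC ItemCollection shape, which is a GeoJSON FeatureCollection whose features are items, the file's structure can be sketched as below. The ids are placeholders, and `item_collection` is a name invented for this example.

```python
import json


def item_collection(items):
    """Wrap STAC Items in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(items)}


def bare_item(item_id):
    """A minimal item with no geometry or assets, for illustration only."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": None,
        "properties": {"datetime": "2024-08-27T20:37:29.293151Z"},
        "links": [],
        "assets": {},
    }


collection = item_collection(bare_item(f"image-{n}") for n in range(3))
print(json.dumps(collection, indent="\t"))
```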
					

Use other tools

Migrate


$ stacrs migrate bands-v1.0.0.json --version 1.1.0-beta.1 | \
	jq '.stac_version'
"1.1.0-beta.1"
					

Translate


$ stacrs translate \
	1000-sentinel-2-items.json \
	1000-sentinel-2-items.parquet


$ stacrs translate --geoparquet-compression snappy \
	1000-sentinel-2-items.json \
	1000-sentinel-2-items-snappy.parquet
					

$ du -h 1000-sentinel-2-items* | sort -h
7.0M    1000-sentinel-2-items.parquet
 23M    1000-sentinel-2-items.json


$ du -h 1000-sentinel-2-items* | sort -h
1.1M    1000-sentinel-2-items-snappy.parquet
7.0M    1000-sentinel-2-items.parquet
 23M    1000-sentinel-2-items.json
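To put the `du` numbers in proportion, here is the arithmetic as a small sketch. The sizes are the rounded values `du -h` printed, so the ratios are approximate.

```python
# Sizes in MB, as reported by `du -h` above.
sizes_mb = {
    "json": 23.0,
    "parquet": 7.0,
    "parquet (snappy)": 1.1,
}

# How many times smaller each file is than the JSON original.
ratios = {name: sizes_mb["json"] / size for name, size in sizes_mb.items()}

for name, ratio in ratios.items():
    print(f"{name}: {sizes_mb[name]} MB, {ratio:.1f}x smaller than JSON")
```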
					

GeoParquet

  • Stores geospatial vector data in Parquet
  • v1.1.0 released May 2024
  • geoarrow is a related specification

🤔 STAC items are geospatial vectors

stac-geoparquet

"Benchmarks"

Test             stac-rs            stac-geoparquet    Speedup
json -> parquet  1.281 s ± 0.024 s  1.869 s ± 0.25 s   30%
parquet -> json  2.352 s ± 0.031 s  3.352 s ± 0.066 s  30%

  • 1000 Sentinel-2 items over Colorado
  • Each test reads data from disk and writes it back in the other format
  • SNAPPY compression for parquet data
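The slides do not say how "Speedup" is computed. Assuming it means the reduction in wall-clock time relative to the slower tool, the mean timings above roughly reproduce the reported figures (the first row comes out nearer 31% than 30%):

```python
def speedup(fast_s, slow_s):
    """Fraction of the slower tool's time that the faster tool saves."""
    return (slow_s - fast_s) / slow_s


# Mean timings in seconds from the table above: (stac-rs, stac-geoparquet).
timings = {
    "json -> parquet": (1.281, 1.869),
    "parquet -> json": (2.352, 3.352),
}

for test, (stac_rs, stac_geoparquet) in timings.items():
    print(f"{test}: {speedup(stac_rs, stac_geoparquet):.1%}")
```

The calculation ignores the reported standard deviations, which matter most for the first row, where the stac-geoparquet spread is 0.25 s.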

For servers

<rant/>

Serve


$ stacrs serve spec-examples/v1.0.0/collection.json
Serving a STAC API at http://127.0.0.1:7822 using a memory backend (loaded 1 collection and 3 items)
					

Serve items


$ stacrs serve 1000-sentinel-2-items.json
Serving a STAC API at http://127.0.0.1:7822 using a memory backend (loaded 1 collection and 1000 items)
						

Serve pgstac


$ stacrs serve --pgstac postgresql://username:password@localhost:5432/postgis
Serving a STAC API at http://127.0.0.1:7822 using a pgstac backend
						

"Benchmarks"

Test               stac-rs            stac-fastapi       Speedup
One page of items  48.3 ms ± 3.2 ms   62.0 ms ± 2.2 ms   22%
Search everything  4.460 s ± 0.061 s  5.894 s ± 0.045 s  24%

  • 1000 Sentinel-2 items over Colorado
  • pgstac backend
  • localhost
  • Search used 10 items per page (so 100 requests)
  • ~1% of a single request's time is spent in pgstac
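The "Search everything" row can be cross-checked against the "One page of items" row with a little arithmetic. This sketch only divides the mean timings above by the request count; it assumes every page request costs about the same.

```python
TOTAL_ITEMS = 1000
ITEMS_PER_PAGE = 10

# Number of page requests needed to walk the whole result set.
requests = TOTAL_ITEMS // ITEMS_PER_PAGE

# Mean "Search everything" timings in seconds, from the table above.
search_everything_s = {"stac-rs": 4.460, "stac-fastapi": 5.894}

# Average time per page request, in milliseconds.
per_request_ms = {
    server: total_s / requests * 1000
    for server, total_s in search_everything_s.items()
}

for server, ms in per_request_ms.items():
    print(f"{server}: {ms:.1f} ms per request over {requests} requests")
```

The per-request averages land in the same range as the single-page timings, so the two rows are consistent with each other.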

Consumers

Search

  stac-geoparquet


$ stacrs search https://planetarycomputer.microsoft.com/api/stac/v1 \
	-c sentinel-2-l2a --intersects @colorado.json \
	--max-items 1000 --sortby="-properties.datetime" \
	1000-sentinel-2-items.json
						

$ stacrs search https://planetarycomputer.microsoft.com/api/stac/v1 \
	-c sentinel-2-l2a --intersects @colorado.json \
	--max-items 1000 --sortby="-properties.datetime" \
	1000-sentinel-2-items.parquet
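Both searches pass `--intersects @colorado.json`. Assuming the `@` prefix means "read the geometry from this file" and that the file holds a GeoJSON geometry, a rough stand-in for its contents could be built as below. The bounding-box coordinates are approximate values for Colorado chosen for this example, not the file used in the talk, and `bbox_polygon` is an invented helper.

```python
import json


def bbox_polygon(west, south, east, north):
    """Build a GeoJSON Polygon from a bounding box (counterclockwise ring)."""
    return {
        "type": "Polygon",
        "coordinates": [[
            [west, south],
            [east, south],
            [east, north],
            [west, north],
            [west, south],  # close the ring by repeating the first point
        ]],
    }


# Approximate lon/lat bounds of Colorado (assumption, for illustration only).
colorado = bbox_polygon(-109.05, 37.0, -102.04, 41.0)
print(json.dumps(colorado))
```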
						

Search with DuckDB

⚠️ Experimental


$ cargo install stac-cli -F duckdb
						

Then:


$ stacrs search 1000-sentinel-2-items.parquet \
	--intersects @longmont.json
						

$ stacrs search \
	http://localhost:8080/1000-sentinel-2-items.parquet \
	--intersects @longmont.json
						

Python bindings


$ pip install stacrs
						

Then:


import stacrs

item = stacrs.migrate_href("item.json", version="1.1.0-beta.1")
assert item["stac_version"] == "1.1.0-beta.1"
						

stac-rs crates

Where next?

  • stac-geoparquet as an API response
  • object_store (coming soon!)
  • Moar Python bindings
  • ???

Fin

Thank you for your time