diff --git a/.changeset/v1.2.1.md b/.changeset/v1.2.1.md deleted file mode 100644 index 5964d32..0000000 --- a/.changeset/v1.2.1.md +++ /dev/null @@ -1,47 +0,0 @@ ---- -'@walkthru-earth/objex': patch -'@walkthru-earth/objex-utils': patch ---- - -v1.2.1 focuses on making authenticated reads from S3-compatible buckets actually work in the browser, and fixing a handful of smaller bugs surfaced along the way. No breaking changes. Both packages bump together via the changesets `fixed` config. - -### Authenticated S3-compatible reads (the headline fix) - -Before: `signed-s3` connections produced `s3://bucket/key` URLs. DuckDB-WASM's httpfs and the other fetchers signed each request with `Authorization: AWS4-HMAC-SHA256 ...`. The `Authorization` header triggers a CORS preflight, and the preflight is fragile on GCS, where `responseHeader` is dual-purpose (`Access-Control-Expose-Headers` AND `Access-Control-Allow-Headers`): any request header the browser sends that is not listed is silently dropped from the preflight response, the preflight returns 200 without `Access-Control-Allow-Origin`, and the browser blocks the real request. - -After: a new `presignHttpsUrl(conn, key, expiresIn?)` helper in `storage/presign.ts` uses `aws4fetch.signQuery` to return a presigned HTTPS URL with `X-Amz-Signature` in the query string. `buildHttpsUrlAsync` and `buildDuckDbUrlAsync` (new in `utils/url.ts`) surface it to callers, and `resolveTableSourceAsync` (new in `query/source.ts`) wires it into the table-source pipeline. DuckDB httpfs and every other range-request fetcher can now issue `GET` with only a `Range` header, keeping the preflight trivial. The 7-day expiry matches the SigV4 protocol maximum, the hard cap on every provider in the registry. 
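The release delegates the signing to `aws4fetch.signQuery`; as a rough illustration of what a query-string-presigned URL contains, here is a hand-rolled SigV4 sketch. The connection shape, field names, and path-style URL here are assumptions for illustration, not the objex API:

```typescript
import { createHash, createHmac } from "node:crypto";

// Illustrative SigV4 query-string presigner. The shipped code uses
// aws4fetch's signQuery; this sketch only shows the shape of the output URL.
const MAX_EXPIRY = 604_800; // 7 days — the SigV4 protocol maximum

interface Conn {
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
  endpoint: string; // e.g. "https://storage.googleapis.com"
  bucket: string;
}

function presignHttpsUrl(conn: Conn, key: string, expiresIn = MAX_EXPIRY): string {
  const expires = Math.min(expiresIn, MAX_EXPIRY);
  const amzDate = new Date().toISOString().replace(/[-:]/g, "").slice(0, 15) + "Z";
  const date = amzDate.slice(0, 8);
  const scope = `${date}/${conn.region}/s3/aws4_request`;
  const host = new URL(conn.endpoint).host;
  const path = `/${conn.bucket}/${key}`;

  const params = new URLSearchParams({
    "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
    "X-Amz-Credential": `${conn.accessKeyId}/${scope}`,
    "X-Amz-Date": amzDate,
    "X-Amz-Expires": String(expires),
    "X-Amz-SignedHeaders": "host",
  });
  params.sort(); // canonical query string must be sorted

  const canonicalRequest = [
    "GET", path, params.toString(), `host:${host}\n`, "host", "UNSIGNED-PAYLOAD",
  ].join("\n");
  const stringToSign = [
    "AWS4-HMAC-SHA256", amzDate, scope,
    createHash("sha256").update(canonicalRequest).digest("hex"),
  ].join("\n");

  // Signing-key derivation: HMAC chain over date / region / service / "aws4_request"
  let k: Buffer = createHmac("sha256", "AWS4" + conn.secretAccessKey).update(date).digest();
  for (const part of [conn.region, "s3", "aws4_request"]) {
    k = createHmac("sha256", k).update(part).digest();
  }
  const signature = createHmac("sha256", k).update(stringToSign).digest("hex");
  params.set("X-Amz-Signature", signature);
  return `https://${host}${path}?${params.toString()}`;
}
```

The resulting URL authenticates itself, so the fetcher never needs an `Authorization` header and the preflight stays trivial.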
- -Viewers migrated to await the async builders so their external fetchers receive a self-authenticating URL: `TableViewer` (via `resolveTableSourceAsync`, which re-populates the editor only if the user has not edited the generated SQL during the await), `CogViewer`, `CopcViewer`, `ArchiveViewer`, `FlatGeobufViewer`, `PmtilesViewer`, `StacMapViewer`, `CodeViewer`, `ZarrViewer`, `ZarrMapViewer`. `PmtilesMapView` drops its unused sync import. - -`configureStorage(conn, connId, sourceRef?)` in `query/wasm.ts` now short-circuits the full `SET s3_access_key_id / secret / region / endpoint / url_style` block whenever the source ref points at a presigned HTTPS URL (`isHttpsSourceRef(ref)`). Every caller threads the ref or raw SQL through: schema / row-count / CRS probes pass `source.ref`; data-query paths (`query`, `queryForMap`, `queryCancellable`, `queryForMapCancellable`) pass the raw SQL (the regex matches `read_parquet('https://...')` embedded in SQL too). Net effect: one worker round-trip saved per query on every presigned tab, not just at tab open. - -Secondary fixes kept from the same workstream: - -- `configureStorage` falls back to `resolveProviderEndpoint()` when the connection's `endpoint` field is empty and the provider is not plain S3. This covers GCS, DO Spaces, Wasabi, B2, Storj, Contabo, Hetzner, Linode, OVHcloud, so auto-detected `?url=` connections that omit the endpoint still route DuckDB to the correct host on the `s3://` fallback path. -- `configureStorage` hardened against Svelte-proxied `connId` values: template-literal use of a proxied primitive could throw `TypeError: can't convert symbol to string` inside the swallowed catch. `connId` is normalized to a plain string at the top of the function. -- In-app GCS CORS guidance (`CORS_HELP.gcs`) updated. 
The `cors.json` template now includes `Authorization`, `x-amz-content-sha256`, `x-amz-date`, plus `x-amz-*` and `x-goog-*` wildcards, and adds `Range` plus the conditional `If-Match` / `If-Modified-Since` / `If-None-Match` / `If-Unmodified-Since` headers so DuckDB httpfs partial reads pass the preflight. The accompanying note explains that `responseHeader` is dual-purpose and that missing entries cause silent preflight rejections. - -### Credential prompt for private `?url=` buckets - -Auto-detected buckets opened via the `?url=` query param were always saved with `anonymous: true`. When the URL pointed at a private bucket, the first LIST request failed silently and no credential prompt opened; the only workaround was to manually edit the connection in the sidebar. - -Now `BrowserCloudAdapter.listPageS3`, `listPageGcs`, and `BrowserAzureAdapter.listPage` throw a typed `AuthRequiredError` on 401 / 403. The browser store catches it during the first LIST of an anonymous connection and surfaces it on a reactive `authRequired` field. `Sidebar.svelte` watches that field, flips the connection to `anonymous: false`, and calls `ensureCredentials()`, which opens the credential dialog so the user can paste HMAC keys or a SAS token. Public buckets keep the zero-click auto-open flow: the LIST returns 200 and `authRequired` is never triggered. - -### Arrow DECIMAL values render correctly - -`query()` / `queryCancellable()` in `query/wasm.ts` derive column types from `String(field.type)` on the Arrow schema, which emits `Decimal[10e+2]` (precision `e` signed-scale), not the DuckDB `DESCRIBE` form `DECIMAL(10,2)`. The original `decimalScale()` regex matched only the DESCRIBE shape, so `decimalCols` stayed empty and every DECIMAL column fell through to `.get(i)` and rendered as raw `Uint32Array` / `BigInt` (for example, `"12345,0,0,0"` for `123.45`). The regex now matches both shapes, so `formatDecimal()` actually runs and cells render as scaled decimal strings. 
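The fix boils down to accepting both type-string spellings before deriving the scale. A minimal sketch — these function bodies and regexes are illustrative, not the ones shipped in `query/wasm.ts`:

```typescript
// Derive the decimal scale from either spelling of the column type:
// DuckDB DESCRIBE's "DECIMAL(10,2)" or Arrow's "Decimal[10e+2]"
// (precision `e` signed-scale). Returns null for non-decimal types.
function decimalScale(typeStr: string): number | null {
  const m =
    /^DECIMAL\((\d+),\s*(\d+)\)$/i.exec(typeStr) ?? // DESCRIBE shape
    /^Decimal\[(\d+)e([+-]\d+)\]$/.exec(typeStr);   // Arrow toString shape
  return m ? Math.abs(parseInt(m[2], 10)) : null;
}

// Scale a raw BigInt into a display string, e.g. 12345n at scale 2 -> "123.45".
function formatDecimal(raw: bigint, scale: number): string {
  const neg = raw < 0n;
  const abs = (neg ? -raw : raw).toString().padStart(scale + 1, "0");
  const intPart = abs.slice(0, abs.length - scale) || "0";
  const frac = scale > 0 ? "." + abs.slice(abs.length - scale) : "";
  return (neg ? "-" : "") + intPart + frac;
}
```

With only the first regex, `decimalScale("Decimal[10e+2]")` returns `null`, which is exactly how `decimalCols` ended up empty.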
- -### Geometry column auto-detection no longer false-positives - -`findGeoColumn` matched its name hints (`geom`, `geo`, `wkb`, `shape`, ...) with `String.includes`, so a column like `n_geographic_entities` (INT) was detected as a geometry column because it contains `geo`. The fallback now tokenizes column names on snake_case / kebab-case / camelCase / numeric boundaries and requires an exact token match, eliminating the false positives. Earlier priorities (exact known names via `GEO_NAMES`, typed GEOMETRY / WKB_GEOMETRY columns) are unchanged. - -### Invalid-TIFF surface message in `CogViewer` - -`@developmentseed/geotiff` throws `Only tiff supported version:` when the first four bytes of the file do not match a TIFF / BigTIFF signature (`II*\0`, `MM\0*`, `II+\0`, `MM\0+`). This fires on files that advertise `image/tiff` but are corrupt, encrypted, or a different format entirely (GDAL returns "not recognized as being in a supported file format" on the same bytes). `CogViewer` now traps that error during the pre-flight read and shows a clear, localized `map.cogInvalidTiff` message instead of letting `COGLayer` re-invoke the loader and crash uncaught. - -### `@walkthru-earth/objex-utils` packaging and surface - -- `exports["."]` split into nested `import.types` → `./dist/index.d.ts` and `require.types` → `./dist/index.d.cts`, so CJS consumers resolve to the `.d.cts` emitted by `tsup`. `publint` now reports "All good!" on the package build. -- New public re-exports: `QuerySource`, `AccessMode`, `AccessModeInput`, `getAccessMode`, `isPubliclyStreamable`, `resolveProviderEndpoint`, plus the previously-missed `exportToCsv` and `exportToJson`. -- `docs/cog.md` trimmed to only the pure, peer-dep-free helpers actually re-exported. The render-pipeline helpers (`selectCogPipeline`, `createConfigurableGetTileData`, `normalizeCogGeotiff`, `createEpsgResolver`, `fitCogBounds`, `renderNonTiledBitmap`, ...) 
are now explicitly called out as "not re-exported here" so consumers know to depend on the full `@walkthru-earth/objex` package if they need them. -- `docs/storage.md` documents `resolveProviderEndpoint` and the tightened GCS CORS guidance. diff --git a/CHANGELOG.md b/CHANGELOG.md index 176956a..29cb3ba 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,52 @@ # @walkthru-earth/objex +## 1.2.1 + +### Patch Changes + +- [`aa62ae4`](https://github.com/walkthru-earth/objex/commit/aa62ae4e78242ad90bff10ed058027ad61e73b85) Thanks [@yharby](https://github.com/yharby)! - v1.2.1 focuses on making authenticated reads from S3-compatible buckets actually work in the browser, and fixing a handful of smaller bugs surfaced along the way. No breaking changes. Both packages bump together via the changesets `fixed` config. + + ### Authenticated S3-compatible reads (the headline fix) + + Before: `signed-s3` connections produced `s3://bucket/key` URLs. DuckDB-WASM's httpfs and the other fetchers signed each request with `Authorization: AWS4-HMAC-SHA256 ...`. The `Authorization` header triggers a CORS preflight, and the preflight is fragile on GCS, where `responseHeader` is dual-purpose (`Access-Control-Expose-Headers` AND `Access-Control-Allow-Headers`): any request header the browser sends that is not listed is silently dropped from the preflight response, the preflight returns 200 without `Access-Control-Allow-Origin`, and the browser blocks the real request. + + After: a new `presignHttpsUrl(conn, key, expiresIn?)` helper in `storage/presign.ts` uses `aws4fetch.signQuery` to return a presigned HTTPS URL with `X-Amz-Signature` in the query string. `buildHttpsUrlAsync` and `buildDuckDbUrlAsync` (new in `utils/url.ts`) surface it to callers, and `resolveTableSourceAsync` (new in `query/source.ts`) wires it into the table-source pipeline. DuckDB httpfs and every other range-request fetcher can now issue `GET` with only a `Range` header, keeping the preflight trivial. 
The 7-day expiry matches the SigV4 protocol maximum, the hard cap on every provider in the registry. + + Viewers migrated to await the async builders so their external fetchers receive a self-authenticating URL: `TableViewer` (via `resolveTableSourceAsync`, which re-populates the editor only if the user has not edited the generated SQL during the await), `CogViewer`, `CopcViewer`, `ArchiveViewer`, `FlatGeobufViewer`, `PmtilesViewer`, `StacMapViewer`, `CodeViewer`, `ZarrViewer`, `ZarrMapViewer`. `PmtilesMapView` drops its unused sync import. + + `configureStorage(conn, connId, sourceRef?)` in `query/wasm.ts` now short-circuits the full `SET s3_access_key_id / secret / region / endpoint / url_style` block whenever the source ref points at a presigned HTTPS URL (`isHttpsSourceRef(ref)`). Every caller threads the ref or raw SQL through: schema / row-count / CRS probes pass `source.ref`; data-query paths (`query`, `queryForMap`, `queryCancellable`, `queryForMapCancellable`) pass the raw SQL (the regex matches `read_parquet('https://...')` embedded in SQL too). Net effect: one worker round-trip saved per query on every presigned tab, not just at tab open. + + Secondary fixes kept from the same workstream: + + - `configureStorage` falls back to `resolveProviderEndpoint()` when the connection's `endpoint` field is empty and the provider is not plain S3. This covers GCS, DO Spaces, Wasabi, B2, Storj, Contabo, Hetzner, Linode, OVHcloud, so auto-detected `?url=` connections that omit the endpoint still route DuckDB to the correct host on the `s3://` fallback path. + - `configureStorage` hardened against Svelte-proxied `connId` values: template-literal use of a proxied primitive could throw `TypeError: can't convert symbol to string` inside the swallowed catch. `connId` is normalized to a plain string at the top of the function. + - In-app GCS CORS guidance (`CORS_HELP.gcs`) updated. 
The `cors.json` template now includes `Authorization`, `x-amz-content-sha256`, `x-amz-date`, plus `x-amz-*` and `x-goog-*` wildcards, and adds `Range` plus the conditional `If-Match` / `If-Modified-Since` / `If-None-Match` / `If-Unmodified-Since` headers so DuckDB httpfs partial reads pass the preflight. The accompanying note explains that `responseHeader` is dual-purpose and that missing entries cause silent preflight rejections. + + ### Credential prompt for private `?url=` buckets + + Auto-detected buckets opened via the `?url=` query param were always saved with `anonymous: true`. When the URL pointed at a private bucket, the first LIST request failed silently and no credential prompt opened; the only workaround was to manually edit the connection in the sidebar. + + Now `BrowserCloudAdapter.listPageS3`, `listPageGcs`, and `BrowserAzureAdapter.listPage` throw a typed `AuthRequiredError` on 401 / 403. The browser store catches it during the first LIST of an anonymous connection and surfaces it on a reactive `authRequired` field. `Sidebar.svelte` watches that field, flips the connection to `anonymous: false`, and calls `ensureCredentials()`, which opens the credential dialog so the user can paste HMAC keys or a SAS token. Public buckets keep the zero-click auto-open flow: the LIST returns 200 and `authRequired` is never triggered. + + ### Arrow DECIMAL values render correctly + + `query()` / `queryCancellable()` in `query/wasm.ts` derive column types from `String(field.type)` on the Arrow schema, which emits `Decimal[10e+2]` (precision `e` signed-scale), not the DuckDB `DESCRIBE` form `DECIMAL(10,2)`. The original `decimalScale()` regex matched only the DESCRIBE shape, so `decimalCols` stayed empty and every DECIMAL column fell through to `.get(i)` and rendered as raw `Uint32Array` / `BigInt` (for example, `"12345,0,0,0"` for `123.45`). The regex now matches both shapes, so `formatDecimal()` actually runs and cells render as scaled decimal strings. 
+ + ### Geometry column auto-detection no longer false-positives + + `findGeoColumn` matched its name hints (`geom`, `geo`, `wkb`, `shape`, ...) with `String.includes`, so a column like `n_geographic_entities` (INT) was detected as a geometry column because it contains `geo`. The fallback now tokenizes column names on snake_case / kebab-case / camelCase / numeric boundaries and requires an exact token match, eliminating the false positives. Earlier priorities (exact known names via `GEO_NAMES`, typed GEOMETRY / WKB_GEOMETRY columns) are unchanged. + + ### Invalid-TIFF surface message in `CogViewer` + + `@developmentseed/geotiff` throws `Only tiff supported version:` when the first four bytes of the file do not match a TIFF / BigTIFF signature (`II*\0`, `MM\0*`, `II+\0`, `MM\0+`). This fires on files that advertise `image/tiff` but are corrupt, encrypted, or a different format entirely (GDAL returns "not recognized as being in a supported file format" on the same bytes). `CogViewer` now traps that error during the pre-flight read and shows a clear, localized `map.cogInvalidTiff` message instead of letting `COGLayer` re-invoke the loader and crash uncaught. + + ### `@walkthru-earth/objex-utils` packaging and surface + + - `exports["."]` split into nested `import.types` → `./dist/index.d.ts` and `require.types` → `./dist/index.d.cts`, so CJS consumers resolve to the `.d.cts` emitted by `tsup`. `publint` now reports "All good!" on the package build. + - New public re-exports: `QuerySource`, `AccessMode`, `AccessModeInput`, `getAccessMode`, `isPubliclyStreamable`, `resolveProviderEndpoint`, plus the previously-missed `exportToCsv` and `exportToJson`. + - `docs/cog.md` trimmed to only the pure, peer-dep-free helpers actually re-exported. The render-pipeline helpers (`selectCogPipeline`, `createConfigurableGetTileData`, `normalizeCogGeotiff`, `createEpsgResolver`, `fitCogBounds`, `renderNonTiledBitmap`, ...) 
are now explicitly called out as "not re-exported here" so consumers know to depend on the full `@walkthru-earth/objex` package if they need them. + - `docs/storage.md` documents `resolveProviderEndpoint` and the tightened GCS CORS guidance. + ## 1.2.0 ### Minor Changes diff --git a/package.json b/package.json index 1af8549..3640873 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@walkthru-earth/objex", - "version": "1.2.0", + "version": "1.2.1", "description": "Svelte 5 components and utilities for exploring geospatial object storage — S3, GCS, Azure, R2", "author": "Youssef Harby ", "license": "CC-BY-4.0", diff --git a/packages/objex-utils/CHANGELOG.md b/packages/objex-utils/CHANGELOG.md index 1030f0a..37abc22 100644 --- a/packages/objex-utils/CHANGELOG.md +++ b/packages/objex-utils/CHANGELOG.md @@ -1,5 +1,52 @@ # @walkthru-earth/objex-utils +## 1.2.1 + +### Patch Changes + +- [`aa62ae4`](https://github.com/walkthru-earth/objex/commit/aa62ae4e78242ad90bff10ed058027ad61e73b85) Thanks [@yharby](https://github.com/yharby)! - v1.2.1 focuses on making authenticated reads from S3-compatible buckets actually work in the browser, and fixing a handful of smaller bugs surfaced along the way. No breaking changes. Both packages bump together via the changesets `fixed` config. + + ### Authenticated S3-compatible reads (the headline fix) + + Before: `signed-s3` connections produced `s3://bucket/key` URLs. DuckDB-WASM's httpfs and the other fetchers signed each request with `Authorization: AWS4-HMAC-SHA256 ...`. The `Authorization` header triggers a CORS preflight, and the preflight is fragile on GCS, where `responseHeader` is dual-purpose (`Access-Control-Expose-Headers` AND `Access-Control-Allow-Headers`): any request header the browser sends that is not listed is silently dropped from the preflight response, the preflight returns 200 without `Access-Control-Allow-Origin`, and the browser blocks the real request. 
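For reference, a GCS `cors.json` along the lines this entry describes under the secondary fixes might look like the following. The origin and `maxAgeSeconds` are placeholders; the header list mirrors the template described in this entry:

```json
[
  {
    "origin": ["https://your-app.example.com"],
    "method": ["GET", "HEAD"],
    "responseHeader": [
      "Authorization",
      "x-amz-content-sha256",
      "x-amz-date",
      "x-amz-*",
      "x-goog-*",
      "Range",
      "If-Match",
      "If-Modified-Since",
      "If-None-Match",
      "If-Unmodified-Since"
    ],
    "maxAgeSeconds": 3600
  }
]
```

Because `responseHeader` doubles as the allow-list for request headers, any header missing from this array is silently dropped from the preflight response.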
+ + After: a new `presignHttpsUrl(conn, key, expiresIn?)` helper in `storage/presign.ts` uses `aws4fetch.signQuery` to return a presigned HTTPS URL with `X-Amz-Signature` in the query string. `buildHttpsUrlAsync` and `buildDuckDbUrlAsync` (new in `utils/url.ts`) surface it to callers, and `resolveTableSourceAsync` (new in `query/source.ts`) wires it into the table-source pipeline. DuckDB httpfs and every other range-request fetcher can now issue `GET` with only a `Range` header, keeping the preflight trivial. The 7-day expiry matches the SigV4 protocol maximum, the hard cap on every provider in the registry. + + Viewers migrated to await the async builders so their external fetchers receive a self-authenticating URL: `TableViewer` (via `resolveTableSourceAsync`, which re-populates the editor only if the user has not edited the generated SQL during the await), `CogViewer`, `CopcViewer`, `ArchiveViewer`, `FlatGeobufViewer`, `PmtilesViewer`, `StacMapViewer`, `CodeViewer`, `ZarrViewer`, `ZarrMapViewer`. `PmtilesMapView` drops its unused sync import. + + `configureStorage(conn, connId, sourceRef?)` in `query/wasm.ts` now short-circuits the full `SET s3_access_key_id / secret / region / endpoint / url_style` block whenever the source ref points at a presigned HTTPS URL (`isHttpsSourceRef(ref)`). Every caller threads the ref or raw SQL through: schema / row-count / CRS probes pass `source.ref`; data-query paths (`query`, `queryForMap`, `queryCancellable`, `queryForMapCancellable`) pass the raw SQL (the regex matches `read_parquet('https://...')` embedded in SQL too). Net effect: one worker round-trip saved per query on every presigned tab, not just at tab open. + + Secondary fixes kept from the same workstream: + + - `configureStorage` falls back to `resolveProviderEndpoint()` when the connection's `endpoint` field is empty and the provider is not plain S3. 
This covers GCS, DO Spaces, Wasabi, B2, Storj, Contabo, Hetzner, Linode, OVHcloud, so auto-detected `?url=` connections that omit the endpoint still route DuckDB to the correct host on the `s3://` fallback path. + - `configureStorage` hardened against Svelte-proxied `connId` values: template-literal use of a proxied primitive could throw `TypeError: can't convert symbol to string` inside the swallowed catch. `connId` is normalized to a plain string at the top of the function. + - In-app GCS CORS guidance (`CORS_HELP.gcs`) updated. The `cors.json` template now includes `Authorization`, `x-amz-content-sha256`, `x-amz-date`, plus `x-amz-*` and `x-goog-*` wildcards, and adds `Range` plus the conditional `If-Match` / `If-Modified-Since` / `If-None-Match` / `If-Unmodified-Since` headers so DuckDB httpfs partial reads pass the preflight. The accompanying note explains that `responseHeader` is dual-purpose and that missing entries cause silent preflight rejections. + + ### Credential prompt for private `?url=` buckets + + Auto-detected buckets opened via the `?url=` query param were always saved with `anonymous: true`. When the URL pointed at a private bucket, the first LIST request failed silently and no credential prompt opened; the only workaround was to manually edit the connection in the sidebar. + + Now `BrowserCloudAdapter.listPageS3`, `listPageGcs`, and `BrowserAzureAdapter.listPage` throw a typed `AuthRequiredError` on 401 / 403. The browser store catches it during the first LIST of an anonymous connection and surfaces it on a reactive `authRequired` field. `Sidebar.svelte` watches that field, flips the connection to `anonymous: false`, and calls `ensureCredentials()`, which opens the credential dialog so the user can paste HMAC keys or a SAS token. Public buckets keep the zero-click auto-open flow: the LIST returns 200 and `authRequired` is never triggered. 
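A minimal sketch of the typed-error flow — the class shape and guard helper are assumptions; only the 401 / 403 gating and the error name come from this entry:

```typescript
// Typed error thrown when a LIST against an anonymous connection comes back
// 401 / 403, so callers can distinguish "needs credentials" from other failures.
class AuthRequiredError extends Error {
  constructor(
    public readonly status: number,
    public readonly connId: string,
  ) {
    super(`Credentials required for connection "${connId}" (HTTP ${status})`);
    this.name = "AuthRequiredError";
  }
}

// Hypothetical guard a listPage* implementation could call after each fetch.
function throwIfAuthRequired(status: number, connId: string): void {
  if (status === 401 || status === 403) {
    throw new AuthRequiredError(status, connId);
  }
}
```

A store catching this error type (rather than a generic `Error`) is what lets the sidebar flip `anonymous` and open the credential dialog only when credentials are actually the problem.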
+ + ### Arrow DECIMAL values render correctly + + `query()` / `queryCancellable()` in `query/wasm.ts` derive column types from `String(field.type)` on the Arrow schema, which emits `Decimal[10e+2]` (precision `e` signed-scale), not the DuckDB `DESCRIBE` form `DECIMAL(10,2)`. The initial `decimalScale()` regex matched only the DESCRIBE shape, so `decimalCols` stayed empty and every DECIMAL column fell through to `.get(i)` and rendered as raw `Uint32Array` / `BigInt` (for example, `"12345,0,0,0"` for `123.45`). The regex now matches both shapes, so `formatDecimal()` actually runs and cells render as scaled decimal strings. + + ### Geometry column auto-detection no longer false-positives + + `findGeoColumn` matched its name hints (`geom`, `geo`, `wkb`, `shape`, ...) with `String.includes`, so a column like `n_geographic_entities` (INT) was detected as a geometry column because it contains `geo`. The fallback now tokenizes column names on snake_case / kebab-case / camelCase / numeric boundaries and requires an exact token match, eliminating the false positives. Earlier priorities (exact known names via `GEO_NAMES`, typed GEOMETRY / WKB_GEOMETRY columns) are unchanged. + + ### Invalid-TIFF surface message in `CogViewer` + + `@developmentseed/geotiff` throws `Only tiff supported version:` when the first four bytes of the file do not match a TIFF / BigTIFF signature (`II*\0`, `MM\0*`, `II+\0`, `MM\0+`). This fires on files that advertise `image/tiff` but are corrupt, encrypted, or a different format entirely (GDAL returns "not recognized as being in a supported file format" on the same bytes). `CogViewer` now traps that error during the pre-flight read and shows a clear, localized `map.cogInvalidTiff` message instead of letting `COGLayer` re-invoke the loader and crash uncaught. 
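The four signatures can be checked from the first bytes of the file; a sketch of such a pre-flight check (the helper name is illustrative, not the `CogViewer` API):

```typescript
// Check the four magic-byte patterns this entry lists: classic TIFF
// "II*\0" / "MM\0*" and BigTIFF "II+\0" / "MM\0+".
function isTiffSignature(bytes: Uint8Array): boolean {
  if (bytes.length < 4) return false;
  const [a, b, c, d] = bytes;
  // Little-endian: "II" then 42 (0x2A 0x00) or BigTIFF 43 (0x2B 0x00)
  if (a === 0x49 && b === 0x49) return (c === 0x2a || c === 0x2b) && d === 0x00;
  // Big-endian: "MM" then 0x00 0x2A or BigTIFF 0x00 0x2B
  if (a === 0x4d && b === 0x4d) return c === 0x00 && (d === 0x2a || d === 0x2b);
  return false;
}
```

A `Content-Type: image/tiff` header says nothing about the bytes, which is why the check has to run on the data itself.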
+ + ### `@walkthru-earth/objex-utils` packaging and surface + + - `exports["."]` split into nested `import.types` → `./dist/index.d.ts` and `require.types` → `./dist/index.d.cts`, so CJS consumers resolve to the `.d.cts` emitted by `tsup`. `publint` now reports "All good!" on the package build. + - New public re-exports: `QuerySource`, `AccessMode`, `AccessModeInput`, `getAccessMode`, `isPubliclyStreamable`, `resolveProviderEndpoint`, plus the previously-missed `exportToCsv` and `exportToJson`. + - `docs/cog.md` trimmed to only the pure, peer-dep-free helpers actually re-exported. The render-pipeline helpers (`selectCogPipeline`, `createConfigurableGetTileData`, `normalizeCogGeotiff`, `createEpsgResolver`, `fitCogBounds`, `renderNonTiledBitmap`, ...) are now explicitly called out as "not re-exported here" so consumers know to depend on the full `@walkthru-earth/objex` package if they need them. + - `docs/storage.md` documents `resolveProviderEndpoint` and the tightened GCS CORS guidance. + ## 1.2.0 ### Minor Changes diff --git a/packages/objex-utils/package.json b/packages/objex-utils/package.json index d846350..ad37a26 100644 --- a/packages/objex-utils/package.json +++ b/packages/objex-utils/package.json @@ -1,6 +1,6 @@ { "name": "@walkthru-earth/objex-utils", - "version": "1.2.0", + "version": "1.2.1", "description": "Pure TypeScript utilities from objex — WKB parser, GeoArrow builder, storage URL parser, file type registry", "author": "Youssef Harby ", "license": "CC-BY-4.0",