Open
167 changes: 111 additions & 56 deletions doc/export-import.md
Expand Up @@ -15,7 +15,8 @@ await index.export(async function(key, data){
Import from folder `/export/` into an `Index` or `Document-Index`:

```js
const index = new Index({/* keep old config and place it here */});
// Config is restored automatically from the export payload
const index = new Index({});

const files = await fs.readdir("./export/");
for(let i = 0; i < files.length; i++){
Expand All @@ -24,22 +25,26 @@ for(let i = 0; i < files.length; i++){
}
```

> You'll need to use the same configuration you used before the export. Any change to the configuration requires re-indexing.
The export payload includes a `.cfg` key that carries the full index configuration (tokenizer, encoder, resolution, context, score, etc.). A plain `new Index({})` or `new Document({})` is sufficient; there is no need to repeat the original options.

> When a custom encoder or score function is defined **inline** in the original config, it is serialized as source text and reconstructed on import. If the function closes over outer variables, those bindings **will not** be available after restore, so keep any required values inside the function body.

> The feature "fastupdate" is automatically disabled on import.
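
The closure caveat above can be reproduced in plain JavaScript without the library. The sketch below assumes the general mechanism is "function to source text, then back to a function"; `serializeFn`, `restoreFn`, and `makeScore` are hypothetical helpers for illustration, not part of the FlexSearch API.

```js
// Hypothetical helpers: turn a function into source text and back.
function serializeFn(fn) {
    return fn.toString();
}
function restoreFn(source) {
    // The restored function is created in the global scope,
    // so it cannot see the variables of the original closure.
    return new Function("return (" + source + ")")();
}

// A score-like function that closes over an outer variable.
function makeScore() {
    const boost = 2; // lives only in this closure
    return function (x) { return x * boost; };
}
const closedOver = makeScore();

// The same logic with the value kept inside the function body.
const selfContained = function (x) { const boost = 2; return x * boost; };

const restoredOk = restoreFn(serializeFn(selfContained));
const restoredBroken = restoreFn(serializeFn(closedOver));

let restoreError = null;
try {
    restoredBroken(3); // `boost` is not defined after the round trip
} catch (e) {
    restoreError = e;
}
```

Here `restoredOk` still works after the round trip, while calling `restoredBroken` throws because `boost` no longer exists in its scope.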

<a name="serialize"></a>
## Fast-Boot Serialization for Server-Side-Rendering (PHP, Python, Ruby, Rust, Java, Go, Node.js, ...)

> This is an experimental feature with limited support that might be dropped in a future release. Feedback is welcome.

When using server-side rendering, you can create a different kind of export that boots up instantly. Especially with server-side rendered content, this can help restore a __<u>static</u>__ index on page load. Document-Indexes aren't supported yet for this method.
When using server-side rendering, you can create a different kind of export that boots up instantly. Especially with server-side rendered content, this can help restore a __<u>static</u>__ index on page load.

> When your index is too large you should use the default export/import mechanism.

You'll need JavaScript to create the serialized output. Alternatively, just create a small Node.js script to build the output.

As the first step populate the FlexSearch index with your contents.

You have two options:
You have three options:

### 1. Create a function as string

Expand Down Expand Up @@ -70,8 +75,6 @@ inject(index);

That's it.

> You'll need to use the same configuration you used before the export. Any change to the configuration requires re-indexing.

### 2. Create just a function body as string

Alternatively you can use lazy function declaration by passing `false` to the serialize function:
Expand Down Expand Up @@ -101,92 +104,144 @@ const index = new Index();
inject(index);
```

### 3. Self-contained inject (config embedded)

<a name="export"></a>
Pass `true` as the second argument to embed the index configuration inside the serialized output. The restored index needs no external config at all:

## Export / Import (In-Memory)
```js
const fn_body = index.serialize(false, true);
const index2 = new Function("FlexSearch", fn_body)(FlexSearch);
```

### Node.js
This is the recommended approach when the index was built with custom options (custom encoder, score function, tokenizer, etc.) and you cannot guarantee that the consumer will supply the same config. The encoder and score functions are serialized as source text, so the same caveat about closure variables applies as with export/import.
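
To make the difference between the two `new Function` patterns concrete, here is a minimal sketch that runs without the library. `FakeSearch` is a hypothetical stand-in for the `FlexSearch` namespace, and both body strings are invented for illustration; the text that `serialize()` actually generates is not shown in this document and may look different.

```js
// Hypothetical stand-in for the library namespace (not the real FlexSearch).
const FakeSearch = {
    Index: class {
        constructor(config) {
            this.config = config;
            this.map = new Map();
        }
    }
};

// Pattern 1: an "inject" body fills an instance that the caller created.
const injectBody = 'index.map = new Map([["hello", [1, 2]]]);';
const inject = new Function("index", injectBody);
const existing = new FakeSearch.Index({});
inject(existing);

// Pattern 2: a self-contained body creates the instance itself,
// including its config, and returns it to the caller.
const selfContainedBody =
    'const index = new FlexSearch.Index({ resolution: 9 });' +
    'index.map = new Map([["hello", [1, 2]]]);' +
    'return index;';
const created = new Function("FlexSearch", selfContainedBody)(FakeSearch);
```

In the second pattern the namespace is handed in as an argument, which is why the consumer does not have to construct or configure anything beforehand.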

> Persistent-Indexes and Worker-Indexes don't support Import/Export.
<a name="document-serialize"></a>

Export an `Index` or `Document-Index` to the folder `/export/`:
## Document Fast-Boot Serialization

```js
import { promises as fs } from "fs";
Document indexes can also be serialized for fast-boot on the client side. This works similarly to Index serialization but handles multiple fields, tags, and storage.

await index.export(async function(key, data){
await fs.writeFile("./export/" + key, data, "utf8");
});
### Serialize a Document Index

```js
const fn_string = document.serialize();
```

Import from folder `/export/` into an `Index` or `Document-Index`:
This produces a function string that looks like:

```js
const index = new Index({/* keep old config and place it here */});

const files = await fs.readdir("./export/");
for(let i = 0; i < files.length; i++){
const data = await fs.readFile("./export/" + files[i], "utf8");
await index.import(files[i], data);
function inject(doc){
    doc.reg = new Set([/* ... */]);
    doc.index.get("fieldName").map = new Map([/* ... */]);
    doc.index.get("fieldName").ctx = new Map([/* ... */]);
    // ... for each field
}
```

> You'll need to use the same configuration you used before the export. Any change to the configuration requires re-indexing.
### Restore the serialized Document

### Browser
**Option A — self-contained inject (config embedded):**

```js
index.export(function(key, data){

// you need to store both the key and the data!
// e.g. use the key for the filename and save your data

localStorage.setItem(key, data);
});
const fn_body = document.serialize(false, false, true);
const doc = new Function("FlexSearch", fn_body)(FlexSearch);

// Ready to search immediately
const results = doc.search("your query");
```

> The size of the export corresponds to the memory consumption of the library. To reduce export size you have to use a configuration which has less memory footprint (use the table at the bottom to get information about configs and its memory allocation).
Pass `true` as the third argument to embed all field configuration (encoders, tokenizers, score functions, etc.) in the serialized output. `FlexSearch` must be in scope when the function runs.

When your save routine runs asynchronously you have to use `async/await` or return a promise:
**Option B — inject into a pre-created document (no config needed):**

```js
index.export(function(key, data){

return new Promise(function(resolve){

// do the saving as async
const fn_body = document.serialize(false);
const inject = new Function("doc", fn_body);

resolve();
});
});
// A plain new Document({}) is enough — config is read from the serialized data
const doc = new Document({});
inject(doc);

// Ready to search
const results = doc.search("your query");
```

Before you can import data, you need to create your index first. For document indexes, provide the same document descriptor you used when exporting the data. This configuration isn't stored in the export.
### Without function wrapper

Get just the body if you want to wrap it differently:

```js
const index = new Index({/* keep old config and place it here */});
const fn_body = document.serialize(false);
const inject = new Function("doc", fn_body);
```

To import the data just pass a key and data:
## Bulk Export / Import

Use the bulk export APIs when you want all index data in a single payload for transport or storage:

```js
// Export uncompressed (returns JSON string)
const json = await index.exportIndexBulk();

// Export compressed (returns gzip Uint8Array)
const compressed = await index.exportIndexBulk(true);
```
const data = localStorage.getItem(key);
index.import(key, data);

```js
// Import uncompressed JSON string
const restored = new Index({});
await restored.importIndexBulk(json);

// Import compressed Uint8Array
const restored2 = new Index({});
await restored2.importIndexBulk(compressed, true);
```

You need to import every key; otherwise, your index will not work. Store the keys from the export and use these keys for the import (the order of the keys can differ).
Same pattern for `Document`:

> The feature "fastupdate" is automatically disabled on import.
```js
const json = await doc.exportDocumentBulk();
const docRestored = new Document({});
await docRestored.importDocumentBulk(json);
```

These methods collect all export data into a `Map`, serialize it to JSON, and optionally compress the result with gzip. They build on the same bulk import support described below, which keeps the implementation simple and maintainable.
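
The "collect into a Map, then serialize to JSON" step can be sketched as follows. This is only an illustration under assumptions: `toBulkJson` and `fromBulkJson` are hypothetical helper names, the keys in `payload` are made up, and the real bulk format may differ. Compression is left out to keep the example synchronous.

```js
// A Map is not directly JSON-serializable: JSON.stringify(new Map()) yields "{}".
// One common approach is to serialize its entries array instead.
function toBulkJson(payload) {
    return JSON.stringify(Array.from(payload.entries()));
}

// Rebuild the Map from the parsed entries array.
function fromBulkJson(json) {
    return new Map(JSON.parse(json));
}

// Hypothetical export payload: key -> serialized chunk (both strings).
const payload = new Map([
    ["reg", "[1,2,3]"],
    ["map", "[[\"hello\",[1]]]"]
]);

const bulkJson = toBulkJson(payload);
const restoredPayload = fromBulkJson(bulkJson);
```

The restored `Map` (or its entries array) has the same shape as the payload that the bulk `import()` overload described in this document accepts.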

This is just for demonstration and is not recommended, because you might have other keys in your localStorage which aren't supported as an import:
### Bulk import convenience

`import()` also accepts a full payload map or entries array and loops internally:

```js
var keys = Object.keys(localStorage);
const payload = new Map();
await index.export((key, data) => payload.set(key, data));

for(let i = 0, key, data; i < keys.length; i++){
key = keys[i]
data = localStorage.getItem(key);
index.import(key, data);
}
const index2 = new Index({});
index2.import(payload); // or index2.import(Array.from(payload.entries()))
```

### Utility helpers for generic strings

`compress()` and `decompress()` stay as convenience helpers for string payloads (for example serialized fast-boot function strings):

```js
import { compress, decompress } from "flexsearch";

const fnString = index.serialize(false);
const compressed = await compress(fnString);
const restored = await decompress(compressed);
```

#### API

| Function | Signature | Returns |
|---|---|---|
| `exportIndexBulk` | `(compressed?: boolean) => Promise<string \| Uint8Array>` | JSON string (uncompressed) or Uint8Array (compressed) |
| `importIndexBulk` | `(source: string \| Uint8Array, compressed?: boolean) => Promise<void>` | Restores from bulk payload |
| `exportDocumentBulk` | `(compressed?: boolean) => Promise<string \| Uint8Array>` | JSON string (uncompressed) or Uint8Array (compressed) |
| `importDocumentBulk` | `(source: string \| Uint8Array, compressed?: boolean) => Promise<void>` | Restores from bulk payload |
| `import` | `(payload: Map<string, string> \| Array<[string, string]>) => void` | Bulk import convenience |
| `serialize` | **Index:** `(withFunctionWrapper?: boolean, withCfg?: boolean) => SerializedFunctionString` | **Index:** Fast-boot function string or body |
| | **Document:** `(withFunctionWrapper?: boolean, withCompression?: boolean, withCfg?: boolean) => SerializedFunctionString \| Promise<Uint8Array>` | **Document:** Fast-boot function string/body or compressed data |
| `compress` | `(data: string) => Promise<Uint8Array>` | Compress string data |
| `decompress` | `(data: Uint8Array) => Promise<string>` | Decompress to string |

33 changes: 33 additions & 0 deletions index.d.ts
Expand Up @@ -19,6 +19,9 @@ declare module "flexsearch" {
export type Limit = number;
export type ExportHandler = (key: string, data: string) => void;
export type ExportHandlerAsync = (key: string, data: string) => Promise<void>;
export type ExportEntries = Array<[string, string]>;
export type ExportMap = Map<string, string>;
export type CompressedSource = Uint8Array | ArrayBuffer | ReadableStream<Uint8Array>;
export type AsyncCallback<T> = (result?: T) => void;

/************************************/
Expand Down Expand Up @@ -148,6 +151,20 @@ declare module "flexsearch" {
LatinDefault: EncoderOptions
};

/**
* Compress a string using gzip compression
* @param data - String data to compress
* @returns Promise that resolves to compressed Uint8Array
*/
export function compress(data: string): Promise<Uint8Array>;

/**
* Decompress gzip-compressed data
* @param data - Compressed data as Uint8Array
* @returns Promise that resolves to decompressed string
*/
export function decompress(data: Uint8Array): Promise<string>;

/**
* These options will determine how the contents will be indexed.
*
Expand Down Expand Up @@ -276,8 +293,15 @@ declare module "flexsearch" {
export(handler: ExportHandlerAsync): Promise<void>;

import(key: string, data: string): void;
import(payload: ExportMap): void;
import(payload: ExportEntries): void;

exportIndexBulk(compressed?: boolean): Promise<string | Uint8Array>;
importIndexBulk(source: string | Uint8Array, compressed?: boolean): Promise<void>;


serialize(with_function_wrapper?: boolean): SerializedFunctionString;
serialize(with_function_wrapper: boolean, with_cfg: boolean): SerializedFunctionString;

// Persistent Index
mount(db: StorageInterface): Promise<void>;
Expand Down Expand Up @@ -746,6 +770,15 @@ declare module "flexsearch" {
export(handler: ExportHandlerAsync): Promise<void>;

import(key: string, data: string): void;
import(payload: ExportMap): void;
import(payload: ExportEntries): void;

exportDocumentBulk(compressed?: boolean): Promise<string | Uint8Array>;
importDocumentBulk(source: string | Uint8Array, compressed?: boolean): Promise<void>;


serialize(with_function_wrapper?: boolean, compress?: boolean): SerializedFunctionString | Promise<Uint8Array>;
serialize(with_function_wrapper: boolean, compress: boolean, with_cfg: boolean): SerializedFunctionString | Promise<Uint8Array>;

// Persistent Index
mount<S = StorageInterface<D>>(db: S): Promise<void>;