4 changes: 4 additions & 0 deletions .env.rag.example
@@ -73,6 +73,10 @@ S3_ACCOUNT1_SCHEDULES=
#SERPAPI_QUERIES="OpenAI news, Bitcoin price, Tesla updates"
#SERPAPI_SCHEDULES=60

# SLAB CONNECTORS (optional):
#SLAB1_API_TOKEN=your-slab-api-token
#SLAB1_SCHEDULES=3600

# WEB CONNECTORS (optional):
#WEB1_URLS=https://example.com/page1,https://example.com/page2
#WEB1_SCHEDULES=60
36 changes: 30 additions & 6 deletions README.md
@@ -127,7 +127,6 @@ The connector has the following configuration options:
# config.yaml

sources:
-
  - type: "s3" # must be s3
    name: "account1" # arbitrary name for the connector, will be stored in metadata
    config:
@@ -138,7 +137,7 @@ sources:
      use_ssl: "${S3_ACCOUNT1_USE_SSL}" # use ssl for s3 connection, can be True or False
      buckets: "${S3_ACCOUNT1_BUCKETS}" # single entry or comma-separated list i.e. bucket1,bucket2
      schedules: "${S3_ACCOUNT1_SCHEDULES}" # single entry or comma-separated list i.e. 3600,60

  - type: "s3"
    name: "account2"
    config:
@@ -150,7 +149,7 @@ sources:
...
```

````dotenv
```dotenv
# .env.rag

S3_ACCOUNT1_ENDPOINT=https://s3.amazonaws.com
@@ -160,7 +159,7 @@ S3_ACCOUNT1_REGION=us-east-1
S3_ACCOUNT1_USE_SSL=True
S3_ACCOUNT1_BUCKETS=bucket1,bucket2
S3_ACCOUNT1_SCHEDULES=3600,60
````
```

### MediaWiki Connector

@@ -199,7 +198,7 @@ MEDIAWIKI1_SCHEDULES=3600
# Only needed for private wikis requiring login:
#MEDIAWIKI1_USERNAME=your-bot-username
#MEDIAWIKI1_PASSWORD=your-bot-password
````
```

### SerpAPI Connector

@@ -232,7 +231,32 @@ sources:
SERPAPI1_KEY=xxxx
SERPAPI1_QUERIES=aaa
SERPAPI1_SCHEDULES=3600
````
```

### Slab Connector

The Slab connector ingests posts from a [Slab](https://slab.com/) knowledge base via the GraphQL API.
When `topic_ids` is configured, only posts belonging to those topics are ingested.
When `topic_ids` is omitted, all organisation posts are fetched using cursor-based pagination.

```yaml
# config.yaml

sources:
  - type: "slab"
    name: "slab1"
    config:
      api_token: "${SLAB1_API_TOKEN}"
      # topic_ids: "topic_abc123,topic_def456" # optional
      schedules: "${SLAB1_SCHEDULES}"
```
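
The `${SLAB1_API_TOKEN}` and `${SLAB1_SCHEDULES}` placeholders above are expected to be filled from environment variables such as those in `.env.rag`. How the project actually resolves them is not shown in this diff, so the following is only a minimal sketch of one way to do it, using Python's standard `string.Template`. The helper names `expand_placeholders`, `split_csv`, and `parse_schedules` are hypothetical.

```python
import os
import string
from typing import List, Mapping, Optional


def expand_placeholders(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} placeholders with values from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return string.Template(value).safe_substitute(env)


def split_csv(value: str) -> List[str]:
    """Split a comma-separated setting into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_schedules(value: str) -> List[int]:
    """Parse a schedules setting such as "3600,60" into a list of integers."""
    return [int(item) for item in split_csv(value)]
```

The same `split_csv` helper would also cover a comma-separated `topic_ids` value such as `"topic_abc123,topic_def456"`.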

```dotenv
# .env.rag

SLAB1_API_TOKEN=your-slab-api-token
SLAB1_SCHEDULES=3600
```
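
The cursor-based pagination mentioned in this section can be sketched roughly as follows. This is an illustration of the general technique, not the connector's actual code: `iter_posts`, `fetch_page`, the page keys `posts` and `next_cursor`, and the in-memory backend are all hypothetical stand-ins for a real call to the Slab GraphQL API, whose schema is not shown here.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical page shape: {"posts": [...], "next_cursor": "<opaque string>" or None}
Page = Dict[str, Any]
FetchPage = Callable[[Optional[str], int], Page]


def iter_posts(fetch_page: FetchPage, batch_size: int = 100) -> Iterator[dict]:
    """Yield every post by following cursors until no next cursor is returned."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor, batch_size)
        yield from page["posts"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


def make_fake_backend(posts: List[dict]) -> FetchPage:
    """Build an in-memory stand-in for the remote API (illustration only)."""
    def fetch_page(cursor: Optional[str], batch_size: int) -> Page:
        # The fake cursor is simply the index of the next unread post.
        start = int(cursor) if cursor is not None else 0
        end = start + batch_size
        next_cursor = str(end) if end < len(posts) else None
        return {"posts": posts[start:end], "next_cursor": next_cursor}
    return fetch_page
```

In a real connector, `fetch_page` would send the cursor to the API and read the next cursor from the response. The loop itself stays the same.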

### Web Connector

11 changes: 11 additions & 0 deletions config.yaml.example
@@ -47,6 +47,17 @@ sources:
# queries: "${SERPAPI_QUERIES}"
# schedules: "${SERPAPI_SCHEDULES}"


#- type: "slab"
#  name: "slab1"
#  config:
#    api_token: "${SLAB1_API_TOKEN}"
#    # topic_ids: "topic_abc123,topic_def456" # optional
#    max_retries: 3 # optional, default 3
#    retry_delay: 2 # optional, seconds between retries (default 2)
#    search_batch_size: 100 # optional, posts per search page (default 100)
#    schedules: "${SLAB1_SCHEDULES}"

# Web scraper — URLs mode
#- type: "web"
# name: "web1"
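
The `max_retries` and `retry_delay` options in the commented Slab block of `config.yaml.example` suggest a fixed-delay retry loop around each API call. The connector's real retry logic is not part of the lines shown here, so the following is only a sketch under stated assumptions: `max_retries` is taken to mean retries after the first failed attempt, `retry_delay` is a constant wait in seconds, and the helper name `call_with_retries` and its injectable `sleep` parameter are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T],
                      max_retries: int = 3,
                      retry_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying up to max_retries times with a fixed delay between tries."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            # Assumption: any exception is retryable. A real client would
            # likely retry only on timeouts or specific HTTP status codes.
            if attempt >= max_retries:
                raise
            attempt += 1
            sleep(retry_delay)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.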