3 changes: 3 additions & 0 deletions tutorials/pulseaudio/.gitignore
@@ -0,0 +1,3 @@
## Panda
styled-system
styled-system-studio
32 changes: 32 additions & 0 deletions tutorials/pulseaudio/.oakappignore
@@ -0,0 +1,32 @@
# Python virtual environments
venv/
.venv/

# Node.js
# ignore node_modules, it will be reinstalled in the container
node_modules/

# Multimedia files
media/

# Documentation
README.md

# VCS
.git/
.github/
.gitlab/

# The following files are ignored by default
# uncomment a line if you explicitly need it

# !*.oakapp

# Python
# !**/.mypy_cache/
# !**/.ruff_cache/

# IDE files
# !**/.idea
# !**/.vscode
# !**/.zed
156 changes: 156 additions & 0 deletions tutorials/pulseaudio/README.md
@@ -0,0 +1,156 @@
# Audio Recorder (PulseAudio)

This example demonstrates how to record audio directly on a Luxonis device using **PulseAudio (`parec`)** while streaming video from the device cameras.\
The recording is stored on the device and can be downloaded through the web interface.

______________________________________________________________________

## Demo

When running, the frontend shows:

- live video stream from the device
- controls to start/stop recording
- ability to download the last recorded audio file

______________________________________________________________________

## Usage

Running this example requires a **Luxonis device connected to your network**.\
Refer to the official documentation if you haven’t set up your device yet:

https://docs.luxonis.com/software-v3/

This example runs entirely on the device in **Standalone mode**.

______________________________________________________________________

## Available Parameters

```txt
-d DEVICE, --device DEVICE
    Optional name, DeviceID or IP of the camera to connect to. (default: None)

-fps FPS_LIMIT, --fps_limit FPS_LIMIT
    FPS limit for the video stream. (default: 30)

--audio_device AUDIO_DEVICE
    Optional PulseAudio source name (e.g. regular0, regular1, regular2).
    If not set, the system default is used. (default: None)
```

______________________________________________________________________

## Standalone Mode (RVC4 only)

In standalone mode the application runs fully on the device.\
The frontend and backend services are served from the device itself.

To run this example you need the **oakctl** tool installed.

Installation instructions:

https://docs.luxonis.com/software-v3/oak-apps/oakctl

______________________________________________________________________

## Running the Example

### Connect to the device

```bash
oakctl connect <DEVICE_IP>
```

### Run the application

```bash
oakctl app run .
```

This will build and deploy the application to the device.

______________________________________________________________________

## Audio Recording

The example records audio using **PulseAudio** via the `parec` command.

Recordings are stored on the device under:

```path
/data/recordings
```

Each recording is saved as a WAV file:

```path
recording_<timestamp>.wav
```
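
As a rough illustration of how such a recording could be launched, the sketch below builds a timestamped file name and a `parec` command line. It is not the example's actual `AudioRecorder` implementation, which is not shown in this README. The helper names (`recording_path`, `build_parec_command`, `start_recording`) and the timestamp format are assumptions made for illustration; `--file-format=wav` and `--device` are standard `parec` options.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Tuple

RECORDINGS_DIR = Path("/data/recordings")  # directory named in this README


def recording_path(now: datetime, directory: Path = RECORDINGS_DIR) -> Path:
    """Return recording_<timestamp>.wav in `directory` (timestamp format is an assumption)."""
    return directory / f"recording_{now.strftime('%Y%m%d_%H%M%S')}.wav"


def build_parec_command(output: Path, source: Optional[str] = None) -> List[str]:
    """Build a parec command line that writes a WAV file to `output`."""
    cmd = ["parec", "--file-format=wav"]
    if source:
        # Without --device, parec records from the server's default source.
        cmd.append(f"--device={source}")
    cmd.append(str(output))
    return cmd


def start_recording(source: Optional[str] = None) -> Tuple[subprocess.Popen, Path]:
    """Start parec in the background; terminating the process ends the recording."""
    RECORDINGS_DIR.mkdir(parents=True, exist_ok=True)
    output = recording_path(datetime.now())
    return subprocess.Popen(build_parec_command(output, source)), output
```

Keeping the command construction separate from `subprocess.Popen` makes the naming and argument logic easy to check without a PulseAudio server.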

______________________________________________________________________

## Audio Sources

Depending on the device configuration, multiple PulseAudio sources may be available.

Common examples:

```txt
regular0 - deep buffer stream with echo cancellation
regular1 - raw audio without post-processing
regular2 - low latency stream
```

The source can be selected via the `--audio_device` argument.
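
To find out which source names exist on a particular device, one option is to query the PulseAudio server with `pactl list short sources`. The sketch below is not part of this example: it assumes `pactl` is installed on the device, and the function names are hypothetical. It parses the tab-separated output, in which the second field is the source name.

```python
import subprocess
from typing import List


def parse_short_sources(text: str) -> List[str]:
    """Extract source names from `pactl list short sources` output.

    Each line is tab-separated: index, name, driver, sample spec, state.
    """
    names = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1]:
            names.append(fields[1])
    return names


def list_sources() -> List[str]:
    """Ask the PulseAudio server for its sources (assumes pactl is available)."""
    result = subprocess.run(
        ["pactl", "list", "short", "sources"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_short_sources(result.stdout)
```

Any name returned this way could then be passed as the audio source argument.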

______________________________________________________________________

## Frontend Controls

| Control | Description |
| --------------- | ------------------------------------ |
| Start Recording | Starts audio recording on the device |
| Stop | Stops recording |
| Download | Downloads the most recent recording |

The **Download** button is disabled while recording to prevent incomplete files from being retrieved.

______________________________________________________________________

## How It Works

The backend provides several services exposed to the frontend:

| Service | Purpose |
| ------------------ | ------------------------------------------------ |
| Start Recording | Starts the PulseAudio recording process |
| Stop Recording | Stops recording and saves the file |
| List Recordings | Lists available recordings |
| Download Recording | Returns the selected recording encoded in base64 |

Audio recording is implemented using:

```txt
parec
```

which connects to the device’s PulseAudio server.
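
Since the Download Recording service returns the file encoded in base64, the round trip can be sketched as follows. This is an illustration only, not the code in `download.py` (which is not shown here); the function names and the payload fields `filename` and `data` are assumptions, while `ok` mirrors the response shape used by the start and stop services.

```python
import base64
from pathlib import Path


def encode_recording(path: Path) -> dict:
    """Read a recording and wrap it in a JSON-friendly payload (field names are hypothetical)."""
    data = Path(path).read_bytes()
    return {
        "ok": True,
        "filename": Path(path).name,
        "data": base64.b64encode(data).decode("ascii"),
    }


def decode_recording(payload: dict) -> bytes:
    """Inverse step a client would perform before saving the file."""
    return base64.b64decode(payload["data"])
```

Base64 increases the payload size by roughly one third, which is usually acceptable for short recordings but worth keeping in mind for long ones.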

______________________________________________________________________

## File Structure

```txt
backend/
  src/
    main.py
    utils/
      audioRecorder.py
      download.py
      arguments.py

frontend/
  src/
    App.tsx
```
3 changes: 3 additions & 0 deletions tutorials/pulseaudio/backend-run.sh
@@ -0,0 +1,3 @@
#!/bin/sh
echo "Starting Backend"
exec python3.12 /app/backend/src/main.py
1 change: 1 addition & 0 deletions tutorials/pulseaudio/backend/requirements.txt
@@ -0,0 +1 @@
depthai == 3.3.0
95 changes: 95 additions & 0 deletions tutorials/pulseaudio/backend/src/main.py
@@ -0,0 +1,95 @@
import logging as log

import depthai as dai

from utils.arguments import initialize_argparser
from utils.audioRecorder import AudioRecorder
from utils.download import list_recordings_service, download_recording_service


log.basicConfig(level=log.INFO)
logger = log.getLogger(__name__)


_, args = initialize_argparser()

visualizer = dai.RemoteConnection(serveFrontend=False)
device = dai.Device(dai.DeviceInfo(args.device)) if args.device else dai.Device()

# Audio recorder (PulseAudio via `parec`)
recorder = AudioRecorder(device=getattr(args, "audio_device", None))


def _svc(name: str, fn, err: str):
    def _inner(_: object | None = None):
        path = fn()
        return {"ok": True, "path": str(path)} if path else {"ok": False, "error": err}

    _inner.__name__ = name
    return _inner


start_recording_service = _svc(
    "start_recording_service", recorder.start, "Failed to start recording"
)
stop_recording_service = _svc(
    "stop_recording_service", recorder.stop, "No recording in progress"
)


visualizer.registerService("Start Recording", start_recording_service)
visualizer.registerService("Stop Recording", stop_recording_service)
visualizer.registerService("List Recordings", list_recordings_service)
visualizer.registerService("Download Recording", download_recording_service)


with dai.Pipeline(device) as pipeline:
    logger.info("Creating pipeline...")

    sensors = device.getConnectedCameraFeatures()
    primary = sensors[0].socket.name if sensors else None

    for sensor in sensors:
Review comment (Collaborator): Is streaming all sensors needed for the purpose of this example? It seems better to stream just one sensor, since the visual doesn't really matter here. That would also save bandwidth transferred from the device to the host.

cam = pipeline.create(dai.node.Camera).build(sensor.socket)

w, h = sensor.width, sensor.height
req = (w, h) if w <= 1920 and h <= 1080 else (1920, 1080)

cam_out = cam.requestOutput(
req,
dai.ImgFrame.Type.NV12,
fps=args.fps_limit,
)

encoder = pipeline.create(dai.node.VideoEncoder)
encoder.setDefaultProfilePreset(
args.fps_limit,
dai.VideoEncoderProperties.Profile.H264_MAIN,
)
cam_out.link(encoder.input)

# Publish each camera stream under its socket name
visualizer.addTopic(sensor.socket.name, encoder.out, "images")

# Also publish the first camera under a stable name for custom frontends
if primary is not None and sensor.socket.name == primary:
visualizer.addTopic("Video", encoder.out, "images")

logger.info("Pipeline created.")

pipeline.start()
visualizer.registerPipeline(pipeline)

logger.info("Pipeline running.")

try:
while pipeline.isRunning():
if visualizer.waitKey(1) == ord("q"):
logger.info("Got 'q' key from the remote connection. Exiting...")
break
pipeline.processTasks()

finally:
if recorder.is_recording:
logger.info("Stopping recording due to shutdown...")
recorder.stop()
38 changes: 38 additions & 0 deletions tutorials/pulseaudio/backend/src/utils/arguments.py
@@ -0,0 +1,38 @@
import argparse


def initialize_argparser():
    """Initialize the argument parser for the script."""
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )

    parser.add_argument(
        "-d",
        "--device",
        help="Optional name, DeviceID or IP of the camera to connect to.",
        required=False,
        default=None,
        type=str,
    )

    parser.add_argument(
        "-fps",
        "--fps_limit",
        help="FPS limit for the video stream.",
        required=False,
        default=30,
        type=int,
    )

    # Spelled with an underscore to match the README and --fps_limit;
    # argparse still exposes the value as args.audio_device.
    parser.add_argument(
        "--audio_device",
        help="Optional PulseAudio source name (e.g., regular0). If not set, system default is used.",
        required=False,
        default=None,
        type=str,
    )

    args = parser.parse_args()

    return parser, args