Add shortcut to toggle audio recording and REST API #1114

neo773 · 2025-01-10T00:36:38Z

/closes #1065
/claim #1065

a.mp4

Summary by CodeRabbit

Release Notes

New Features
- Added global shortcuts for starting and stopping audio recording.
- Introduced audio device management capabilities.
- Enhanced server-side audio device control with new API endpoints for starting and stopping audio devices.
Improvements
- Improved concurrency handling for audio devices using DashMap.
- Added more robust error handling for audio-related operations.
- Extended application settings with new configuration options, including audio shortcuts.
Technical Updates
- Added dashmap dependency across multiple packages.
- Updated server and client-side audio processing logic.
- Enhanced logging and error visibility during audio device management.

vercel · 2025-01-10T00:36:43Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
screenpipe	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jan 11, 2025 8:19pm

coderabbitai · 2025-01-10T00:36:46Z

Walkthrough

This pull request introduces comprehensive audio device management capabilities across the Screenpipe application. The changes span multiple components, including the Tauri app, server, and audio processing modules. Key additions include global shortcuts for starting and stopping audio recording, new API endpoints for audio device control, and a shift from SegQueue to DashMap for more robust concurrent device state management. The implementation enables users to control audio devices through both the UI and global keyboard shortcuts.

Changes

File	Change Summary
`Cargo.toml`	Added `dashmap` dependency to workspace
`screenpipe-app-tauri/app/page.tsx`	Added audio device management event listeners
`screenpipe-app-tauri/components/settings/shortcut-*`	Added audio recording shortcut configuration
`screenpipe-app-tauri/lib/api/index.ts`	Added `startAudio` and `stopAudio` methods
`screenpipe-app-tauri/lib/hooks/use-settings.tsx`	Updated `Settings` type with new audio-related fields
`screenpipe-app-tauri/src-tauri/src/main.rs`	Enhanced shortcut handling for audio control
`screenpipe-server/Cargo.toml`	Added `dashmap` dependency for server package
`screenpipe-server/src/bin/screenpipe-server.rs`	Updated audio device management using `DashMap`
`screenpipe-server/src/core.rs`	Updated audio device management using `DashMap`
`screenpipe-server/src/server.rs`	Added REST API endpoints for audio device control

Assessment against linked issues

Objective	Addressed	Explanation
Implement API to stop/start audio devices (#1065)	✅
Add shortcut to stop/start in app UI (#1065)	✅

Possibly related PRs

Vision Streaming #1110: Shared implementation of dashmap dependency across project components.

Poem

🎙️ Hear ye, hear ye, a rabbit's delight!
Audio controls now shine so bright
Shortcuts dance, devices sing
With Screenpipe's new audio wing
Recording starts, then gently stops
Our furry friend's innovation never drops! 🐰🎧

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c6303b and bd01a3f.

📒 Files selected for processing (1)

screenpipe-server/src/bin/screenpipe-server.rs (6 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (5)

GitHub Check: test-windows
GitHub Check: test-macos
GitHub Check: test-ubuntu
GitHub Check: test-linux
GitHub Check: test-windows

🔇 Additional comments (3)

screenpipe-server/src/bin/screenpipe-server.rs (3)

306-307: Good choice using DashMap for concurrent device state management!

The transition from SegQueue to DashMap is a solid improvement that provides:

Better concurrent access patterns with built-in synchronization

More intuitive key-value based device state management

Reduced complexity in state updates

470-478: Recording functionality properly integrates audio device management.

The continuous recording setup correctly incorporates:

Audio device control integration

Proper parameter handling

Consistent configuration management

Line range hint 539-598: Well-implemented device monitoring with proper error handling!

The implementation successfully addresses the previous review suggestions by:

Using a broadcast channel for device state updates

Implementing proper error handling in a separate handler function

Including appropriate logging for state changes and errors

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (9)

screenpipe-server/src/bin/screenpipe-server.rs (1)
368-368: Consider renaming sender_clone for clarity

The variable sender_clone is now a DashMap due to the change from SegQueue to DashMap. Renaming it to something like device_control_map would improve code readability and accurately reflect its purpose.

Apply this diff to rename the variable:
-let sender_clone = audio_devices_control.clone();
+let device_control_map = audio_devices_control.clone();
screenpipe-app-tauri/lib/api/index.ts (1)
33-33: Consistent use of double quotes in constructor

Changing the baseUrl default value to use double quotes ensures consistency with common TypeScript conventions.

Apply this diff for consistency:
-constructor(baseUrl: string = 'http://localhost:3030') {
+constructor(baseUrl: string = "http://localhost:3030") {
screenpipe-audio/src/bin/screenpipe-audio-forever.rs (1)
98-98: Consider enhancing error handling for the new parameter.

While adding the new optional parameter is correct, consider adding explicit error handling for cases where device control might be required but not provided.
-        None,
+        None.map(|ctrl| {
+            // Handle potential device control initialization errors
+            ctrl.map_err(|e| anyhow!("Failed to initialize device control: {}", e))
+        }).transpose()?,
screenpipe-app-tauri/app/page.tsx (2)
43-47: Consider caching audio devices to prevent frequent store access.

The getAudioDevices function queries the store on every invocation. Consider caching the devices or using a React ref to optimize performance.
+  const audioDevicesRef = React.useRef<string[]>([]);
   const getAudioDevices = async () => {
+    if (audioDevicesRef.current.length > 0) {
+      return audioDevicesRef.current;
+    }
     const store = await getStore();
     const devices = (await store.get("audioDevices")) as string[];
+    audioDevicesRef.current = devices;
     return devices;
   };
105-105: Remove or enhance console.log statement.

The console.log statement should either be removed or enhanced with additional context for debugging purposes.
-      console.log("audio-devices", devices);
+      console.debug("[Audio Shortcuts] Available devices:", devices);
screenpipe-audio/src/stt.rs (1)

207-216: Device state check implementation looks robust.

The implementation properly handles both active and missing device cases with appropriate debug logging. The early continue statements prevent unnecessary processing.

Consider adding metrics or structured logging to track skipped audio processing for better observability.

screenpipe-server/src/core.rs (1)

35-35: Great architectural improvement using DashMap!

The transition from SegQueue to DashMap enhances concurrent device state management:

Better support for multiple readers and writers

More efficient state lookups

Improved thread safety with built-in synchronization

Also applies to: 281-281
screenpipe-app-tauri/src-tauri/src/main.rs (1)
279-307: Consider extracting store updates to a reusable function.

The audio shortcut registration is well-implemented, but the store update logic could be refactored for better reusability:
+fn update_audio_state(app: &AppHandle, enabled: bool) -> Result<(), String> {
+    let store = get_store(app, None).map_err(|e| e.to_string())?;
+    store.set("disableAudio", !enabled);
+    store.save().map_err(|e| e.to_string())
+}

 register_shortcut(
     app,
     &config.start_audio,
     config.is_disabled("start_audio"),
     |app| {
-        let store = get_store(app, None).unwrap();
-        store.set("disableAudio", false);
-        store.save().unwrap();
+        let _ = update_audio_state(app, true);
         let _ = app.emit("shortcut-start-audio", ());
     },
 )
screenpipe-server/src/server.rs (1)
1624-1698: Consider extracting device validation to reduce duplication.

The handlers are well-implemented but contain duplicated device validation logic. Consider refactoring:
+async fn validate_audio_device(device: &AudioDevice) -> Result<(), (StatusCode, JsonResponse<Value>)> {
+    let available_devices = list_audio_devices().await.map_err(|e| {
+        (StatusCode::INTERNAL_SERVER_ERROR, 
+         JsonResponse(json!({
+             "error": format!("failed to list audio devices: {}", e),
+             "success": false
+         })))
+    })?;
+
+    if !available_devices.contains(device) {
+        return Err((StatusCode::BAD_REQUEST, 
+            JsonResponse(json!({
+                "error": format!("device not found: {}", device.name),
+                "success": false
+            }))));
+    }
+    Ok(())
+}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67f09c9 and 6fe4f1f.

📒 Files selected for processing (18)

Cargo.toml (1 hunks)
screenpipe-app-tauri/app/page.tsx (3 hunks)
screenpipe-app-tauri/components/settings/shortcut-row.tsx (3 hunks)
screenpipe-app-tauri/components/settings/shortcut-section.tsx (1 hunks)
screenpipe-app-tauri/lib/api/index.ts (2 hunks)
screenpipe-app-tauri/lib/hooks/use-settings.tsx (3 hunks)
screenpipe-app-tauri/src-tauri/src/main.rs (4 hunks)
screenpipe-audio/Cargo.toml (1 hunks)
screenpipe-audio/benches/record_and_transcribe_benchmark.rs (1 hunks)
screenpipe-audio/src/bin/screenpipe-audio-forever.rs (1 hunks)
screenpipe-audio/src/bin/screenpipe-audio.rs (1 hunks)
screenpipe-audio/src/core.rs (1 hunks)
screenpipe-audio/src/stt.rs (4 hunks)
screenpipe-audio/tests/core_tests.rs (1 hunks)
screenpipe-server/Cargo.toml (1 hunks)
screenpipe-server/src/bin/screenpipe-server.rs (7 hunks)
screenpipe-server/src/core.rs (6 hunks)
screenpipe-server/src/server.rs (12 hunks)

🔇 Additional comments (26)

screenpipe-server/src/bin/screenpipe-server.rs (7)

39-39: DashMap imported for concurrent device control management

The addition of use dashmap::DashMap; correctly incorporates DashMap for thread-safe, concurrent access to audio device control states.

306-307: Switch from SegQueue to DashMap for device control

Replacing SegQueue with DashMap allows for concurrent reads and writes to the audio device control map without explicit locking, enhancing performance and safety.

464-466: Cloning variables for scoped usage is appropriate

Cloning vad_engine and realtime_transcription_sender ensures thread-safe access within the asynchronous task.

471-474: Passing cloned audio_devices_control_recording to recording function

Passing the cloned audio_devices_control_recording to start_continuous_recording aligns with the updated concurrency model.

493-493: Including realtime_transcription_sender_clone in parameters

Passing Arc::new(realtime_transcription_sender_clone) ensures that the real-time transcription sender is correctly shared across tasks.

533-535: Initialize broadcast channel for audio device updates

The introduction of audio_devices_tx and cloning it for use in different parts of the application facilitates communication of audio device state changes.

552-572: Handle dynamic audio device state changes in spawned task

The asynchronous task correctly listens to the broadcast channel and updates audio_devices_control_for_spawn accordingly. Error handling for unavailable devices and logging enhances reliability.

screenpipe-audio/benches/record_and_transcribe_benchmark.rs (1)

28-28: Update create_whisper_channel call with new optional parameter

Adding None as the new argument aligns with the updated signature of create_whisper_channel, ensuring the function is called correctly with all required parameters.
screenpipe-app-tauri/lib/api/index.ts (4)

45-45: Properly handle API response success flags

Throwing an error when data.success is false in listPipes method ensures that failures are correctly propagated to the caller.

49-52: Log and rethrow errors in listPipes method

Adding error logging and rethrowing in the catch block aids in debugging and ensures that errors aren't silently swallowed.

54-80: Implement startAudio method with robust error handling

The new startAudio method correctly sends a POST request to the server and includes comprehensive error handling for both network and API-level errors.

81-104: 🛠️ Refactor suggestion

⚠️ Potential issue

Improve device type detection in stopAudio method

The current logic for determining the device type may fail if the device name does not include "(input)" or "(output)", or if these strings appear in the device name unintentionally. This could lead to incorrect device control commands being sent to the server.

Apply this diff to enhance the reliability of device type determination:
-const type = deviceName.includes("(input)") ? "Input" : "Output";
-const name = deviceName.replaceAll("(input)", "").replaceAll("(output)", "").trim();
+// Assume deviceName is an object with properties 'name' and 'type'
+// Pass deviceName as { name: string, type: 'Input' | 'Output' }
+const { name, type } = deviceName;
Alternatively, ensure that the device type is provided explicitly when calling stopAudio, and adjust the method signature accordingly.

Likely invalid or redundant comment.
screenpipe-app-tauri/components/settings/shortcut-section.tsx (1)

54-69: LGTM! Audio shortcut components follow established patterns.

The new ShortcutRow components for audio controls are well-integrated and maintain consistency with existing shortcut implementations.

screenpipe-audio/src/bin/screenpipe-audio.rs (1)

102-102: LGTM! Consistent implementation with robust error handling.

The None parameter addition aligns with other files, and the existing timeout handling provides good reliability.

screenpipe-app-tauri/components/settings/shortcut-row.tsx (2)

92-93: LGTM! Audio shortcut integration looks good.

The audio shortcut fields are properly integrated into the shortcut synchronization logic, maintaining consistency with the existing shortcut system.

Also applies to: 101-102, 112-113

179-179: Good addition of error logging.

Error logging enhancement improves debugging capabilities for shortcut-related issues.

screenpipe-app-tauri/lib/hooks/use-settings.tsx (1)

90-91: Audio shortcut fields properly typed and initialized.

The new audio shortcut fields are correctly added to the Settings type and initialized with empty strings in DEFAULT_SETTINGS.

Also applies to: 147-148
screenpipe-audio/src/stt.rs (1)

159-159: Verify thread safety of DashMap usage.

The addition of Arc<DashMap<AudioDevice, DeviceControl>> is a good choice for concurrent device state management. However, let's verify there are no potential deadlocks or race conditions in the codebase.

Run this script to check for potential race conditions:

✅ Verification successful

DashMap usage is thread-safe and well-implemented

The implementation follows concurrent programming best practices:

Thread-safe sharing through Arc wrapper

Atomic operations using DashMap's internal synchronization

Clean state management via broadcast channels

No potential deadlocks or race conditions detected
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check for potential race conditions in device control access
# Look for places where DashMap is accessed without proper synchronization

# Search for DashMap usage patterns
rg -A 5 "DashMap|DeviceControl" 

# Search for potential race conditions in async contexts
ast-grep --pattern 'async fn $name($params) {
  $$$
  $map.insert($key, $value)
  $$$
  await
  $$$
  $map.remove($key)
  $$$
}'
Length of output: 14091
screenpipe-audio/src/core.rs (1)

Line range hint 53-57: LGTM! Good addition of the Debug trait.

Adding the Debug trait to DeviceControl struct enhances debugging capabilities, making it easier to inspect device states during development and troubleshooting.

screenpipe-server/src/core.rs (1)

295-304: Well-implemented device state management!

The changes effectively handle device lifecycle:

Proper iteration over active devices

Skip already handled devices to prevent duplicates

Clean removal of stopped devices

Also applies to: 314-315

screenpipe-app-tauri/src-tauri/src/main.rs (2)

79-80: LGTM! Good extension of shortcut configuration.

The new audio shortcut fields maintain consistency with existing shortcut patterns.

143-150: Well-implemented shortcut initialization!

The initialization follows the established pattern and properly handles defaults.

screenpipe-server/src/server.rs (2)

1610-1621: LGTM! Well-designed request/response structs.

The structs are well-designed with:

Proper serialization attributes

Optional fields with defaults

Clear and meaningful field names

1742-1743: LGTM! Well-structured API routes.

The new audio control routes follow REST conventions and maintain consistency with existing endpoints.
Cargo.toml (1)

39-39: LGTM! Verify version compatibility.

The addition of dashmap as a workspace dependency is appropriate for concurrent device state management. Version 6.1.0 is stable and well-tested.

Run this script to check for any newer versions with important updates:

✅ Verification successful

Version 6.1.0 is optimal ✓

The PR uses the latest stable version of dashmap, which is not yanked and actively maintained.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check latest dashmap versions and changelogs
curl -s https://crates.io/api/v1/crates/dashmap | jq -r '.versions[0] | {num:.num, yanked:.yanked}'
Length of output: 141
screenpipe-audio/Cargo.toml (1)

68-68: LGTM! Consistent workspace dependency.

The addition of dashmap using workspace inheritance ensures version consistency across the project.

screenpipe-app-tauri/app/page.tsx

screenpipe-audio/tests/core_tests.rs

screenpipe-server/Cargo.toml

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (8)

screenpipe-app-tauri/lib/api/index.ts (1)
83-110: Reduce code duplication in audio control methods.

The stopAudio method duplicates most of the logic from startAudio. Consider extracting the common functionality into a private helper method.

Here's a suggested refactor:
+ private async controlAudio(deviceName: string, action: "start" | "stop"): Promise<void> {
+   if (!deviceName?.trim()) {
+     throw new Error("Device name cannot be empty");
+   }
+   if (!deviceName.includes("(input)") && !deviceName.includes("(output)")) {
+     throw new Error("Invalid device name format. Must include '(input)' or '(output)'");
+   }
+   try {
+     const type = deviceName.includes("(input)") ? "Input" : "Output";
+     const name = deviceName.replaceAll("(input)", "").replaceAll("(output)", "").trim();
+     const response = await fetch(`${this.baseUrl}/audio/${action}`, {
+       method: "POST",
+       headers: {
+         "Content-Type": "application/json",
+       },
+       body: JSON.stringify({
+         device_name: name,
+         device_type: type,
+       }),
+     });
+
+     if (!response.ok) {
+       throw new Error(`failed to ${action} audio: ${response.statusText}`);
+     }
+
+     const data = await response.json();
+     if (!data.success) {
+       throw new Error(`failed to ${action} audio: ${data.message}`);
+     }
+   } catch (error) {
+     console.error(`error ${action}ing audio:`, error);
+     throw error;
+   }
+ }
+
+ async startAudio(deviceName: string): Promise<void> {
+   return this.controlAudio(deviceName, "start");
+ }
+
+ async stopAudio(deviceName: string): Promise<void> {
+   return this.controlAudio(deviceName, "stop");
+ }
screenpipe-server/src/bin/screenpipe-server.rs (2)

368-368: Consider parameterizing the sleep duration before inserting into sender_clone.

The hardcoded 15-second delay might benefit from being configurable or documented to clarify its purpose, improving maintainability.

558-578: Handle potential errors from rx.recv().await to ensure task reliability.

If all senders are dropped, rx.recv().await will return Err(RecvError::Closed), causing the loop to exit. Consider handling the error to maintain the task or explicitly break if intended.
screenpipe-server/src/core.rs (1)
20-20: Remove unused import: HashSet.

The HashSet import is unused and can be safely removed to clean up the codebase.

Apply this diff to remove the unused import:
-use std::collections::{HashMap, HashSet};
+use std::collections::HashMap;
🧰 Tools

🪛 GitHub Check: test-ubuntu

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-linux

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-macos

[warning] 20-20:
unused import: HashSet
screenpipe-server/src/server.rs (4)
1630-1667: Handle the result of state.audio_devices_tx.send in start_audio_device.

The send method can return an error if there are no active subscribers or if the channel is full. Consider handling or logging the error to ensure that control messages are sent successfully.

Apply this diff to handle the potential error:
-        let _ = state.audio_devices_tx.send((device.clone(), control));
+        if let Err(e) = state.audio_devices_tx.send((device.clone(), control)) {
+            error!("Failed to send control message: {}", e);
+        }
1670-1704: Handle the result of state.audio_devices_tx.send in stop_audio_device.

Similar to start_audio_device, handle or log any errors returned by the send method to ensure reliability.

Apply this diff to handle the potential error:
-        let _ = state.audio_devices_tx.send((device.clone(), DeviceControl {
+        if let Err(e) = state.audio_devices_tx.send((device.clone(), DeviceControl {
             is_running: false,
             is_paused: false,
-        }));
+        })) {
+            error!("Failed to send control message: {}", e);
+        }
1630-1667: Consider adding unit tests for start_audio_device handler.

Adding unit tests ensures the new API endpoint behaves as expected and helps prevent future regressions.

Would you like assistance in creating unit tests for this handler?

1670-1704: Consider adding unit tests for stop_audio_device handler.

Unit tests will help verify the correctness of the stop_audio_device function and enhance code reliability.

Do you want help in writing unit tests for this handler?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe4f1f and fd4744b.

📒 Files selected for processing (5)

Cargo.toml (1 hunks)
screenpipe-app-tauri/lib/api/index.ts (2 hunks)
screenpipe-server/src/bin/screenpipe-server.rs (6 hunks)
screenpipe-server/src/core.rs (6 hunks)
screenpipe-server/src/server.rs (14 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

Cargo.toml

🧰 Additional context used

🪛 GitHub Check: test-ubuntu

screenpipe-server/src/core.rs

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-linux

screenpipe-server/src/core.rs

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-macos

screenpipe-server/src/core.rs

[warning] 20-20:
unused import: HashSet

🪛 GitHub Actions: Linux Integration Test

screenpipe-server/src/server.rs

[error] 35-35: The name list_monitors is defined multiple times

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: test-windows
GitHub Check: test-windows

🔇 Additional comments (18)

screenpipe-app-tauri/lib/api/index.ts (2)

45-45: LGTM! Enhanced error handling.

The improved error handling with detailed logging will help with debugging issues.

Also applies to: 49-51

54-81: 🛠️ Refactor suggestion

Enhance robustness and documentation of audio device handling.

Several improvements could make this method more robust and maintainable:

The device type determination is fragile and relies on string patterns.

The method lacks input validation and documentation.

Consider these improvements:
+ /** 
+  * Starts audio recording/playback for the specified device.
+  * @param deviceName - The name of the audio device (must include "(input)" or "(output)")
+  * @throws {Error} If the device name is invalid or the API request fails
+  */
 async startAudio(deviceName: string): Promise<void> {
+  if (!deviceName?.trim()) {
+    throw new Error("Device name cannot be empty");
+  }
+  if (!deviceName.includes("(input)") && !deviceName.includes("(output)")) {
+    throw new Error("Invalid device name format. Must include '(input)' or '(output)'");
+  }
   try {
     const type = deviceName.includes("(input)") ? "Input" : "Output";
     const name = deviceName.replaceAll("(input)", "").replaceAll("(output)", "").trim();
Let's verify the device name format assumptions across the codebase:

✅ Verification successful

Input validation for audio device names is essential and aligns with codebase patterns.

The review comment's concerns about device name format validation are valid. The codebase consistently uses "(input)" and "(output)" suffixes for device names across multiple components, making the suggested input validation and documentation crucial for maintaining this convention.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for patterns of device name usage to verify the format assumptions
rg -A 2 "includes$['\"].*input|output.*['\"]$" 
Length of output: 15965
screenpipe-server/src/bin/screenpipe-server.rs (4)

39-39: LGTM!

The import of dashmap::DashMap is necessary for the updated audio device control implementation.

306-307: LGTM!

Initialization of audio_devices_control with DashMap enhances concurrent access and is correctly cloned for usage in different contexts.

470-478: LGTM!

The parameters for start_continuous_recording have been correctly updated to align with the new DashMap implementation for audio device control.

539-541: LGTM!

Creation of the audio_devices_tx broadcast channel and its cloning are properly implemented for transmitting audio device control messages.

screenpipe-server/src/core.rs (5)

6-6: LGTM!

Importing dashmap::DashMap is necessary for the new concurrent audio device control mechanism.

36-36: LGTM!

The function signature of start_continuous_recording has been appropriately updated to use DashMap for audio device control.

124-124: LGTM!

Passing audio_devices_control to create_whisper_channel ensures consistent audio device management across components.

297-297: LGTM!

The updated record_audio function signature correctly reflects the new DashMap usage.

311-331: LGTM!

Iterating over audio_devices_control using DashMap entries is appropriate for concurrent device state management.

screenpipe-server/src/server.rs (7)

50-50: LGTM!

Importing tokio::sync::broadcast is necessary for implementing the audio device control broadcast channel.

65-65: LGTM!

Updating AppState with audio_devices_tx aligns with the new broadcast mechanism for audio device control.

804-804: LGTM!

The Server struct's inclusion of audio_devices_tx is consistent with the updated architecture.

821-821: LGTM!

Constructor parameters for Server::new have been appropriately updated.

835-835: LGTM!

Initialization of Server with the new audio_devices_tx aligns with the broadcast-based design.

859-859: LGTM!

Passing audio_devices_tx into AppState ensures that audio device control messages are managed correctly.

1615-1622: LGTM!

The AudioDeviceControlRequest struct is properly defined to handle audio device control requests.

screenpipe-server/src/server.rs

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (3)

screenpipe-server/src/bin/screenpipe-server.rs (2)
306-307: Consider adding a capacity hint to DashMap initialization.

Since the number of audio devices is known in advance, you can optimize the DashMap initialization by providing an initial capacity hint to avoid reallocation.
-    let audio_devices_control = Arc::new(DashMap::new());
+    let audio_devices_control = Arc::new(DashMap::with_capacity(all_audio_devices.len()));
539-541: Consider the broadcast channel capacity.

The broadcast channel capacity of 100 might be insufficient for high-frequency device state changes, potentially leading to message drops. Consider:

Making the capacity configurable

Implementing a backpressure mechanism

Logging when messages are dropped
screenpipe-server/src/server.rs (1)
1614-1626: Add documentation and validation for API structs.

The request/response structs would benefit from:

Documentation comments explaining their purpose and usage

Validation attributes for the device_name field

Example usage in doc comments
+/// Request payload for controlling audio devices.
+/// 
+/// # Example
+/// ```
+/// let request = AudioDeviceControlRequest {
+///     device_name: "Microphone".to_string(),
+///     device_type: Some(DeviceType::Input),
+/// };
+/// ```
 #[derive(Deserialize)]
 pub struct AudioDeviceControlRequest {
+    #[serde(validate(length(min = 1)))]
     device_name: String,
     #[serde(default)]
     device_type: Option<DeviceType>,
 }

+/// Response payload for audio device control operations.
 #[derive(Serialize)]
 pub struct AudioDeviceControlResponse {
     success: bool,
     message: String,
 }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd4744b and 7cefbdd.

⛔ Files ignored due to path filters (1)

screenpipe-app-tauri/src-tauri/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

screenpipe-server/src/bin/screenpipe-server.rs (6 hunks)
screenpipe-server/src/server.rs (13 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (5)

GitHub Check: test-windows
GitHub Check: test-ubuntu
GitHub Check: test-windows
GitHub Check: test-macos
GitHub Check: test-linux

🔇 Additional comments (2)

screenpipe-server/src/bin/screenpipe-server.rs (2)

39-40: LGTM!

The import of dashmap::DashMap is correctly placed and aligns with the architectural change from SegQueue to DashMap for concurrent device state management.

368-368: Review the 15-second delay before device control insertion.

The arbitrary 15-second delay before inserting the device control state could cause issues:

Users might attempt to control devices during this delay window

The delay doesn't account for device initialization time variations

No error handling if the device isn't ready after the delay

Consider either removing the delay or implementing a proper device readiness check.

screenpipe-server/src/bin/screenpipe-server.rs

screenpipe-server/src/server.rs

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

screenpipe-app-tauri/app/page.tsx (1)

126-145: ⚠️ Potential issue

Add consistent error handling and debouncing to stop operation.

The stop operation needs the same safeguards as the start operation:

Missing debouncing with isProcessingRef
Inconsistent Promise handling (using forEach instead of Promise.all)
No cleanup in finally block

 listen("shortcut-stop-audio", async (event) => {
+  if (isProcessingRef.current) return;
+  isProcessingRef.current = true;
   try {
     const devices = await getAudioDevices();
     const pipeApi = new PipeApi();
-    devices.forEach((device) => {
-      pipeApi.stopAudio(device);
-    });
+    await Promise.all(devices.map(device => pipeApi.stopAudio(device)));
     toast({
       title: "audio stopped",
       description: "audio has been stopped",
     });
   } catch (error) {
     console.error("error stopping audio:", error);
     toast({
       title: "error stopping audio",
       description: error instanceof Error ? error.message : "unknown error occurred",
       variant: "destructive",
     });
+  } finally {
+    isProcessingRef.current = false;
   }
 }),

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cefbdd and 26ce6a2.

📒 Files selected for processing (5)

screenpipe-app-tauri/app/page.tsx (3 hunks)
screenpipe-server/src/core.rs (6 hunks)
screenpipe-server/src/server.rs (13 hunks)
screenpipe-server/tests/endpoint_test.rs (1 hunks)
screenpipe-server/tests/tags_test.rs (1 hunks)

🧰 Additional context used

🪛 Biome (1.9.4)

screenpipe-app-tauri/app/page.tsx

[error] 101-101: Expected a property, an expression, or a method but instead found 'const isProcessingRef = React.useRef'.

Expected a property, an expression, or a method here.

(parse)

[error] 101-101: expected , but instead found (

Remove (

(parse)

[error] 101-101: expected , but instead found ;

Remove ;

(parse)

[error] 102-102: expected , but instead found (

Remove (

(parse)

⏰ Context from checks skipped due to timeout of 90000ms (5)

GitHub Check: test-macos
GitHub Check: test-ubuntu
GitHub Check: test-windows
GitHub Check: test-linux
GitHub Check: test-windows

🔇 Additional comments (12)

screenpipe-app-tauri/app/page.tsx (4)

4-4: LGTM!

The getStore import is appropriately placed alongside the related useSettings import.

87-99: LGTM!

The formatting changes improve code readability while maintaining the same functionality.

101-124: LGTM on the start audio implementation.

The implementation includes proper debouncing and error handling with the isProcessingRef.

🧰 Tools

🪛 Biome (1.9.4)

[error] 101-101: Expected a property, an expression, or a method but instead found 'const isProcessingRef = React.useRef'.

Expected a property, an expression, or a method here.

(parse)

[error] 101-101: expected , but instead found (

Remove (

(parse)

[error] 101-101: expected , but instead found ;

Remove ;

(parse)

[error] 102-102: expected , but instead found (

Remove (

(parse)

101-145: Consider extracting shared audio operation logic.

Both start and stop operations share similar patterns. Consider extracting the common logic into a shared function to improve maintainability.
+const handleAudioOperation = async (
+  operation: 'start' | 'stop',
+  handler: (device: string) => Promise<void>
+) => {
+  if (isProcessingRef.current) return;
+  isProcessingRef.current = true;
+  try {
+    const devices = await getAudioDevices();
+    const pipeApi = new PipeApi();
+    await Promise.all(devices.map(handler));
+    toast({
+      title: `audio ${operation}ed`,
+      description: `audio has been ${operation}ed`,
+    });
+  } catch (error) {
+    console.error(`error ${operation}ing audio:`, error);
+    toast({
+      title: `error ${operation}ing audio`,
+      description: error instanceof Error ? error.message : "unknown error occurred",
+      variant: "destructive",
+    });
+  } finally {
+    isProcessingRef.current = false;
+  }
+};

-listen("shortcut-start-audio", async () => { ... }),
-listen("shortcut-stop-audio", async () => { ... }),
+listen("shortcut-start-audio", () => 
+  handleAudioOperation('start', device => pipeApi.startAudio(device))
+),
+listen("shortcut-stop-audio", () => 
+  handleAudioOperation('stop', device => pipeApi.stopAudio(device))
+),
🧰 Tools

🪛 Biome (1.9.4)

[error] 101-101: Expected a property, an expression, or a method but instead found 'const isProcessingRef = React.useRef'.

Expected a property, an expression, or a method here.

(parse)

[error] 101-101: expected , but instead found (

Remove (

(parse)

[error] 101-101: expected , but instead found ;

Remove ;

(parse)

[error] 102-102: expected , but instead found (

Remove (

(parse)
screenpipe-server/tests/tags_test.rs (1)

34-34: LGTM! Channel size is appropriate for testing.

The transition to broadcast channel with a buffer size of 1000 is suitable for test scenarios.

screenpipe-server/tests/endpoint_test.rs (1)

37-37: LGTM! Consistent with tags_test.rs implementation.

The broadcast channel initialization maintains consistency across test files.

screenpipe-server/src/core.rs (3)

5-5: LGTM! DashMap is appropriate for concurrent device state management.

Using DashMap provides better concurrent access patterns and state management capabilities compared to SegQueue.

35-35: LGTM! Parameter type update aligns with architectural changes.

The transition to DashMap for audio_devices_control enhances concurrent device state management.

123-123: LGTM! Proper integration with whisper channel.

The whisper channel correctly receives the audio_devices_control reference.
screenpipe-server/src/server.rs (3)

1614-1625: LGTM! Well-structured request/response types.

The types are properly defined with appropriate optional fields and serialization attributes.

1796-1797: 🛠️ Refactor suggestion

Add rate limiting and authentication to audio endpoints.

The audio control endpoints need protection against abuse:

Rate limiting to prevent DoS attacks

Authentication to restrict access

Request validation middleware
-        .route("/audio/start", post(start_audio_device))
-        .route("/audio/stop", post(stop_audio_device))
+        .route("/audio/start", post(start_audio_device))
+            .layer(RateLimitLayer::new(10, Duration::from_secs(60)))
+            .layer(AuthenticationLayer::new())
+        .route("/audio/stop", post(stop_audio_device))
+            .layer(RateLimitLayer::new(10, Duration::from_secs(60)))
+            .layer(AuthenticationLayer::new())
Likely invalid or redundant comment.

1627-1665: 🛠️ Refactor suggestion

Improve error handling and state validation.

The start handler needs improvements:

Validate current device state before starting

Handle broadcast send failures

Update devices_status map
 async fn start_audio_device(
     State(state): State<Arc<AppState>>,
     Json(payload): Json<AudioDeviceControlRequest>,
 ) -> Result<JsonResponse<AudioDeviceControlResponse>, (StatusCode, JsonResponse<Value>)> {
     let device = AudioDevice {
         name: payload.device_name.clone(),
         device_type: payload.device_type.unwrap_or(DeviceType::Input),
     };

+    // Check if device is already running
+    if let Some(current_state) = state.devices_status.get(&device) {
+        if current_state.is_running {
+            return Ok(JsonResponse(AudioDeviceControlResponse {
+                success: true,
+                message: format!("device already running: {}", device.name),
+            }));
+        }
+    }

     let control = DeviceControl { 
         is_running: true, 
         is_paused: false 
     };
     
-    let _ = state.audio_devices_tx.send((device.clone(), control));
+    if let Err(e) = state.audio_devices_tx.send((device.clone(), control.clone())) {
+        error!("Failed to send device control message: {}", e);
+        return Err((StatusCode::INTERNAL_SERVER_ERROR,
+            JsonResponse(json!({
+                "error": "failed to start device",
+                "success": false
+            }))));
+    }
+
+    // Update devices_status
+    state.devices_status.insert(device.clone(), control);
Likely invalid or redundant comment.

screenpipe-app-tauri/app/page.tsx

screenpipe-server/src/core.rs

louis030195 · 2025-01-10T01:58:01Z

screenpipe-audio/src/stt.rs

@@ -155,6 +156,7 @@ pub async fn create_whisper_channel(
    output_path: &PathBuf,
    vad_sensitivity: VadSensitivity,
    languages: Vec<Language>,
+    audio_devices_control: Option<Arc<DashMap<AudioDevice, DeviceControl>>>,


why is audio device control has to change speech to text code instead of audio device recording

also does it work well with the audio streaming api

if i stop a device

Yes, start_realtime_recording() gets checked by parent block if device is running.

louis030195 · 2025-01-10T17:57:08Z

will test this soon, questions:

does it allow to start/stop specific device? (it's ok if it's only in API for now and not in app UI)
how does it affects status? if i stop all audio device does it turn red?

neo773 · 2025-01-10T18:00:35Z

does it allow to start/stop specific device? (it's ok if it's only in API for now and not in app UI)

Yeah, it's granular you can stop/start specific devices.

how does it affects status? if i stop all audio device does it turn red?

No, current logic checks for both audio and video I think for it to turn red.

louis030195 · 2025-01-10T18:19:49Z

Screen.Recording.2025-01-10.at.10.16.39.AM.mov

issues i noticed, not sure if related to this PR:

screenpipe started at app boot despite i had dev mode on
stop button does not work anymore
when assigning shortcut it should disable all shortcuts while we are assigning (this is UX problem, unrelated to this PR, not big deal for now)

louis030195 · 2025-01-11T19:54:25Z

testing agai now

louis030195 · 2025-01-11T20:14:25Z

seem to work\

one issue which seem only dev i think is stop button does not work sometimes, when starting the app with bun tauri dev

would it be possible to add in the status some kind of green/red or thing saying which devices are being recorded (monitor and audio deviec)?

neo773 · 2025-01-11T20:17:14Z

seem to work\

one issue which seem only dev i think is stop button does not work sometimes, when starting the app with bun tauri dev
,
would it be possible to add in the status some kind of green/red or thing saying which devices are being recorded (monitor and audio deviec)?

I think it should be doable, the entire modal needs a redesigned UI/UX imo, current one looks very MVPish/Alpha.

As for stop button I'm not sure, the code diff doesn't really delve into that parts. it's happened to me before too (before this PR).

It's also hard to reproduce

louis030195 · 2025-01-14T02:48:51Z

/approve

algora-pbc · 2025-01-14T02:48:54Z

@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.

neo773 added 7 commits January 10, 2025 03:54

wip

392d664

wip

b52e3f9

works

8b86678

rm log

2585387

wip

9cc1a9b

fully works now

b720abb

add try catch

6fe4f1f

algora-pbc bot mentioned this pull request Jan 10, 2025

[bounty] add shortcut to stop start audio recording #1065

Closed

algora-pbc bot added the 🙋 Bounty claim label Jan 10, 2025

vercel bot deployed to Preview January 10, 2025 00:37 View deployment

fix startAudio api function

dd0932a

vercel bot deployed to Preview January 10, 2025 00:39 View deployment

coderabbitai bot reviewed Jan 10, 2025

View reviewed changes

screenpipe-app-tauri/app/page.tsx Show resolved Hide resolved

screenpipe-app-tauri/app/page.tsx Outdated Show resolved Hide resolved

screenpipe-audio/tests/core_tests.rs Show resolved Hide resolved

screenpipe-server/Cargo.toml Show resolved Hide resolved

Merge branch 'main' into new-shortcut-audio

fd4744b

vercel bot deployed to Preview January 10, 2025 00:44 View deployment

coderabbitai bot reviewed Jan 10, 2025

View reviewed changes

screenpipe-server/src/server.rs Outdated Show resolved Hide resolved

fix merge conflict and build

7cefbdd

vercel bot deployed to Preview January 10, 2025 00:54 View deployment

neo773 and others added 3 commits January 10, 2025 06:25

fix tests

5ef7ccf

rm unused imports

385586c

Update screenpipe-app-tauri/app/page.tsx

26ce6a2

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Jan 10, 2025

View reviewed changes

screenpipe-server/src/bin/screenpipe-server.rs Outdated Show resolved Hide resolved

screenpipe-server/src/server.rs Show resolved Hide resolved

screenpipe-server/src/server.rs Show resolved Hide resolved

screenpipe-server/src/server.rs Show resolved Hide resolved

vercel bot deployed to Preview January 10, 2025 00:57 View deployment

fix ts

8c6303b

vercel bot deployed to Preview January 10, 2025 00:59 View deployment

coderabbitai bot reviewed Jan 10, 2025

View reviewed changes

screenpipe-app-tauri/app/page.tsx Show resolved Hide resolved

screenpipe-server/src/core.rs Show resolved Hide resolved

add code rabbit suggestion

bd01a3f

neo773 changed the title ~~New shortcut audio~~ Add shortcut to toggle audio recording and REST API Jan 10, 2025

vercel bot deployed to Preview January 10, 2025 01:12 View deployment

louis030195 reviewed Jan 10, 2025

View reviewed changes

Merge branch 'main' into new-shortcut-audio

e681359

vercel bot deployed to Preview January 11, 2025 20:19 View deployment

louis030195 mentioned this pull request Jan 12, 2025

fix killing pipes #1130

Merged

louis030195 merged commit d193a5a into mediar-ai:main Jan 14, 2025
5 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add shortcut to toggle audio recording and REST API #1114

Add shortcut to toggle audio recording and REST API #1114

neo773 commented Jan 10, 2025 •

edited

Loading

vercel bot commented Jan 10, 2025 •

edited

Loading

coderabbitai bot commented Jan 10, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

louis030195 Jan 10, 2025

louis030195 Jan 10, 2025

neo773 Jan 10, 2025 •

edited

Loading

louis030195 commented Jan 10, 2025

neo773 commented Jan 10, 2025 •

edited

Loading

louis030195 commented Jan 10, 2025 •

edited

Loading

louis030195 commented Jan 11, 2025

louis030195 commented Jan 11, 2025

neo773 commented Jan 11, 2025 •

edited

Loading

louis030195 commented Jan 14, 2025

algora-pbc bot commented Jan 14, 2025

Add shortcut to toggle audio recording and REST API #1114

Add shortcut to toggle audio recording and REST API #1114

Conversation

neo773 commented Jan 10, 2025 • edited Loading

Summary by CodeRabbit

Release Notes

vercel bot commented Jan 10, 2025 • edited Loading

coderabbitai bot commented Jan 10, 2025 • edited Loading

Walkthrough

Changes

Assessment against linked issues

Possibly related PRs

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

louis030195 Jan 10, 2025

Choose a reason for hiding this comment

louis030195 Jan 10, 2025

Choose a reason for hiding this comment

neo773 Jan 10, 2025 • edited Loading

Choose a reason for hiding this comment

louis030195 commented Jan 10, 2025

neo773 commented Jan 10, 2025 • edited Loading

louis030195 commented Jan 10, 2025 • edited Loading

louis030195 commented Jan 11, 2025

louis030195 commented Jan 11, 2025

neo773 commented Jan 11, 2025 • edited Loading

louis030195 commented Jan 14, 2025

algora-pbc bot commented Jan 14, 2025

neo773 commented Jan 10, 2025 •

edited

Loading

vercel bot commented Jan 10, 2025 •

edited

Loading

coderabbitai bot commented Jan 10, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

neo773 Jan 10, 2025 •

edited

Loading

neo773 commented Jan 10, 2025 •

edited

Loading

louis030195 commented Jan 10, 2025 •

edited

Loading

neo773 commented Jan 11, 2025 •

edited

Loading