Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add shortcut to toggle audio recording and REST API #1114

Merged
merged 16 commits into from
Jan 14, 2025

Conversation

neo773
Copy link
Contributor

@neo773 neo773 commented Jan 10, 2025

/closes #1065
/claim #1065

a.mp4

Summary by CodeRabbit

Release Notes

  • New Features

    • Added global shortcuts for starting and stopping audio recording.
    • Introduced audio device management capabilities.
    • Enhanced server-side audio device control with new API endpoints for starting and stopping audio devices.
  • Improvements

    • Improved concurrency handling for audio devices using DashMap.
    • Added more robust error handling for audio-related operations.
    • Extended application settings with new configuration options, including audio shortcuts.
  • Technical Updates

    • Added dashmap dependency across multiple packages.
    • Updated server and client-side audio processing logic.
    • Enhanced logging and error visibility during audio device management.

Copy link

vercel bot commented Jan 10, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
screenpipe ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 11, 2025 8:19pm

Copy link
Contributor

coderabbitai bot commented Jan 10, 2025

Walkthrough

This pull request introduces comprehensive audio device management capabilities across the Screenpipe application. The changes span multiple components, including the Tauri app, server, and audio processing modules. Key additions include global shortcuts for starting and stopping audio recording, new API endpoints for audio device control, and a shift from SegQueue to DashMap for more robust concurrent device state management. The implementation enables users to control audio devices through both the UI and global keyboard shortcuts.

Changes

File Change Summary
Cargo.toml Added dashmap dependency to workspace
screenpipe-app-tauri/app/page.tsx Added audio device management event listeners
screenpipe-app-tauri/components/settings/shortcut-* Added audio recording shortcut configuration
screenpipe-app-tauri/lib/api/index.ts Added startAudio and stopAudio methods
screenpipe-app-tauri/lib/hooks/use-settings.tsx Updated Settings type with new audio-related fields
screenpipe-app-tauri/src-tauri/src/main.rs Enhanced shortcut handling for audio control
screenpipe-server/Cargo.toml Added dashmap dependency for server package
screenpipe-server/src/bin/screenpipe-server.rs Updated audio device management using DashMap
screenpipe-server/src/core.rs Updated audio device management using DashMap
screenpipe-server/src/server.rs Added REST API endpoints for audio device control

Assessment against linked issues

Objective Addressed Explanation
Implement API to stop/start audio devices (#1065)
Add shortcut to stop/start in app UI (#1065)

Possibly related PRs

Poem

🎙️ Hear ye, hear ye, a rabbit's delight!
Audio controls now shine so bright
Shortcuts dance, devices sing
With Screenpipe's new audio wing
Recording starts, then gently stops
Our furry friend's innovation never drops! 🐰🎧


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c6303b and bd01a3f.

📒 Files selected for processing (1)
  • screenpipe-server/src/bin/screenpipe-server.rs (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test-windows
  • GitHub Check: test-macos
  • GitHub Check: test-ubuntu
  • GitHub Check: test-linux
  • GitHub Check: test-windows
🔇 Additional comments (3)
screenpipe-server/src/bin/screenpipe-server.rs (3)

306-307: Good choice using DashMap for concurrent device state management!

The transition from SegQueue to DashMap is a solid improvement that provides:

  • Better concurrent access patterns with built-in synchronization
  • More intuitive key-value based device state management
  • Reduced complexity in state updates

470-478: Recording functionality properly integrates audio device management.

The continuous recording setup correctly incorporates:

  • Audio device control integration
  • Proper parameter handling
  • Consistent configuration management

Line range hint 539-598: Well-implemented device monitoring with proper error handling!

The implementation successfully addresses the previous review suggestions by:

  • Using a broadcast channel for device state updates
  • Implementing proper error handling in a separate handler function
  • Including appropriate logging for state changes and errors

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (9)
screenpipe-server/src/bin/screenpipe-server.rs (1)

368-368: Consider renaming sender_clone for clarity

The variable sender_clone is now a DashMap due to the change from SegQueue to DashMap. Renaming it to something like device_control_map would improve code readability and accurately reflect its purpose.

Apply this diff to rename the variable:

-let sender_clone = audio_devices_control.clone();
+let device_control_map = audio_devices_control.clone();
screenpipe-app-tauri/lib/api/index.ts (1)

33-33: Consistent use of double quotes in constructor

Changing the baseUrl default value to use double quotes ensures consistency with common TypeScript conventions.

Apply this diff for consistency:

-constructor(baseUrl: string = 'http://localhost:3030') {
+constructor(baseUrl: string = "http://localhost:3030") {
screenpipe-audio/src/bin/screenpipe-audio-forever.rs (1)

98-98: Consider enhancing error handling for the new parameter.

While adding the new optional parameter is correct, consider adding explicit error handling for cases where device control might be required but not provided.

-        None,
+        None.map(|ctrl| {
+            // Handle potential device control initialization errors
+            ctrl.map_err(|e| anyhow!("Failed to initialize device control: {}", e))
+        }).transpose()?,
screenpipe-app-tauri/app/page.tsx (2)

43-47: Consider caching audio devices to prevent frequent store access.

The getAudioDevices function queries the store on every invocation. Consider caching the devices or using a React ref to optimize performance.

+  const audioDevicesRef = React.useRef<string[]>([]);
   const getAudioDevices = async () => {
+    if (audioDevicesRef.current.length > 0) {
+      return audioDevicesRef.current;
+    }
     const store = await getStore();
     const devices = (await store.get("audioDevices")) as string[];
+    audioDevicesRef.current = devices;
     return devices;
   };

105-105: Remove or enhance console.log statement.

The console.log statement should either be removed or enhanced with additional context for debugging purposes.

-      console.log("audio-devices", devices);
+      console.debug("[Audio Shortcuts] Available devices:", devices);
screenpipe-audio/src/stt.rs (1)

207-216: Device state check implementation looks robust.

The implementation properly handles both active and missing device cases with appropriate debug logging. The early continue statements prevent unnecessary processing.

Consider adding metrics or structured logging to track skipped audio processing for better observability.

screenpipe-server/src/core.rs (1)

35-35: Great architectural improvement using DashMap!

The transition from SegQueue to DashMap enhances concurrent device state management:

  • Better support for multiple readers and writers
  • More efficient state lookups
  • Improved thread safety with built-in synchronization

Also applies to: 281-281

screenpipe-app-tauri/src-tauri/src/main.rs (1)

279-307: Consider extracting store updates to a reusable function.

The audio shortcut registration is well-implemented, but the store update logic could be refactored for better reusability:

+fn update_audio_state(app: &AppHandle, enabled: bool) -> Result<(), String> {
+    let store = get_store(app, None).map_err(|e| e.to_string())?;
+    store.set("disableAudio", !enabled);
+    store.save().map_err(|e| e.to_string())
+}

 register_shortcut(
     app,
     &config.start_audio,
     config.is_disabled("start_audio"),
     |app| {
-        let store = get_store(app, None).unwrap();
-        store.set("disableAudio", false);
-        store.save().unwrap();
+        let _ = update_audio_state(app, true);
         let _ = app.emit("shortcut-start-audio", ());
     },
 )
screenpipe-server/src/server.rs (1)

1624-1698: Consider extracting device validation to reduce duplication.

The handlers are well-implemented but contain duplicated device validation logic. Consider refactoring:

+async fn validate_audio_device(device: &AudioDevice) -> Result<(), (StatusCode, JsonResponse<Value>)> {
+    let available_devices = list_audio_devices().await.map_err(|e| {
+        (StatusCode::INTERNAL_SERVER_ERROR, 
+         JsonResponse(json!({
+             "error": format!("failed to list audio devices: {}", e),
+             "success": false
+         })))
+    })?;
+
+    if !available_devices.contains(device) {
+        return Err((StatusCode::BAD_REQUEST, 
+            JsonResponse(json!({
+                "error": format!("device not found: {}", device.name),
+                "success": false
+            }))));
+    }
+    Ok(())
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67f09c9 and 6fe4f1f.

📒 Files selected for processing (18)
  • Cargo.toml (1 hunks)
  • screenpipe-app-tauri/app/page.tsx (3 hunks)
  • screenpipe-app-tauri/components/settings/shortcut-row.tsx (3 hunks)
  • screenpipe-app-tauri/components/settings/shortcut-section.tsx (1 hunks)
  • screenpipe-app-tauri/lib/api/index.ts (2 hunks)
  • screenpipe-app-tauri/lib/hooks/use-settings.tsx (3 hunks)
  • screenpipe-app-tauri/src-tauri/src/main.rs (4 hunks)
  • screenpipe-audio/Cargo.toml (1 hunks)
  • screenpipe-audio/benches/record_and_transcribe_benchmark.rs (1 hunks)
  • screenpipe-audio/src/bin/screenpipe-audio-forever.rs (1 hunks)
  • screenpipe-audio/src/bin/screenpipe-audio.rs (1 hunks)
  • screenpipe-audio/src/core.rs (1 hunks)
  • screenpipe-audio/src/stt.rs (4 hunks)
  • screenpipe-audio/tests/core_tests.rs (1 hunks)
  • screenpipe-server/Cargo.toml (1 hunks)
  • screenpipe-server/src/bin/screenpipe-server.rs (7 hunks)
  • screenpipe-server/src/core.rs (6 hunks)
  • screenpipe-server/src/server.rs (12 hunks)
🔇 Additional comments (26)
screenpipe-server/src/bin/screenpipe-server.rs (7)

39-39: DashMap imported for concurrent device control management

The addition of use dashmap::DashMap; correctly incorporates DashMap for thread-safe, concurrent access to audio device control states.


306-307: Switch from SegQueue to DashMap for device control

Replacing SegQueue with DashMap allows for concurrent reads and writes to the audio device control map without explicit locking, enhancing performance and safety.


464-466: Cloning variables for scoped usage is appropriate

Cloning vad_engine and realtime_transcription_sender ensures thread-safe access within the asynchronous task.


471-474: Passing cloned audio_devices_control_recording to recording function

Passing the cloned audio_devices_control_recording to start_continuous_recording aligns with the updated concurrency model.


493-493: Including realtime_transcription_sender_clone in parameters

Passing Arc::new(realtime_transcription_sender_clone) ensures that the real-time transcription sender is correctly shared across tasks.


533-535: Initialize broadcast channel for audio device updates

The introduction of audio_devices_tx and cloning it for use in different parts of the application facilitates communication of audio device state changes.


552-572: Handle dynamic audio device state changes in spawned task

The asynchronous task correctly listens to the broadcast channel and updates audio_devices_control_for_spawn accordingly. Error handling for unavailable devices and logging enhances reliability.

screenpipe-audio/benches/record_and_transcribe_benchmark.rs (1)

28-28: Update create_whisper_channel call with new optional parameter

Adding None as the new argument aligns with the updated signature of create_whisper_channel, ensuring the function is called correctly with all required parameters.

screenpipe-app-tauri/lib/api/index.ts (4)

45-45: Properly handle API response success flags

Throwing an error when data.success is false in listPipes method ensures that failures are correctly propagated to the caller.


49-52: Log and rethrow errors in listPipes method

Adding error logging and rethrowing in the catch block aids in debugging and ensures that errors aren't silently swallowed.


54-80: Implement startAudio method with robust error handling

The new startAudio method correctly sends a POST request to the server and includes comprehensive error handling for both network and API-level errors.


81-104: 🛠️ Refactor suggestion

⚠️ Potential issue

Improve device type detection in stopAudio method

The current logic for determining the device type may fail if the device name does not include "(input)" or "(output)", or if these strings appear in the device name unintentionally. This could lead to incorrect device control commands being sent to the server.

Apply this diff to enhance the reliability of device type determination:

-const type = deviceName.includes("(input)") ? "Input" : "Output";
-const name = deviceName.replaceAll("(input)", "").replaceAll("(output)", "").trim();
+// Assume deviceName is an object with properties 'name' and 'type'
+// Pass deviceName as { name: string, type: 'Input' | 'Output' }
+const { name, type } = deviceName;

Alternatively, ensure that the device type is provided explicitly when calling stopAudio, and adjust the method signature accordingly.

Likely invalid or redundant comment.

screenpipe-app-tauri/components/settings/shortcut-section.tsx (1)

54-69: LGTM! Audio shortcut components follow established patterns.

The new ShortcutRow components for audio controls are well-integrated and maintain consistency with existing shortcut implementations.

screenpipe-audio/src/bin/screenpipe-audio.rs (1)

102-102: LGTM! Consistent implementation with robust error handling.

The None parameter addition aligns with other files, and the existing timeout handling provides good reliability.

screenpipe-app-tauri/components/settings/shortcut-row.tsx (2)

92-93: LGTM! Audio shortcut integration looks good.

The audio shortcut fields are properly integrated into the shortcut synchronization logic, maintaining consistency with the existing shortcut system.

Also applies to: 101-102, 112-113


179-179: Good addition of error logging.

Error logging enhancement improves debugging capabilities for shortcut-related issues.

screenpipe-app-tauri/lib/hooks/use-settings.tsx (1)

90-91: Audio shortcut fields properly typed and initialized.

The new audio shortcut fields are correctly added to the Settings type and initialized with empty strings in DEFAULT_SETTINGS.

Also applies to: 147-148

screenpipe-audio/src/stt.rs (1)

159-159: Verify thread safety of DashMap usage.

The addition of Arc<DashMap<AudioDevice, DeviceControl>> is a good choice for concurrent device state management. However, let's verify there are no potential deadlocks or race conditions in the codebase.

Run this script to check for potential race conditions:

✅ Verification successful

DashMap usage is thread-safe and well-implemented

The implementation follows concurrent programming best practices:

  • Thread-safe sharing through Arc wrapper
  • Atomic operations using DashMap's internal synchronization
  • Clean state management via broadcast channels
  • No potential deadlocks or race conditions detected
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential race conditions in device control access
# Look for places where DashMap is accessed without proper synchronization

# Search for DashMap usage patterns
rg -A 5 "DashMap|DeviceControl" 

# Search for potential race conditions in async contexts
ast-grep --pattern 'async fn $name($params) {
  $$$
  $map.insert($key, $value)
  $$$
  await
  $$$
  $map.remove($key)
  $$$
}'

Length of output: 14091

screenpipe-audio/src/core.rs (1)

Line range hint 53-57: LGTM! Good addition of the Debug trait.

Adding the Debug trait to DeviceControl struct enhances debugging capabilities, making it easier to inspect device states during development and troubleshooting.

screenpipe-server/src/core.rs (1)

295-304: Well-implemented device state management!

The changes effectively handle device lifecycle:

  • Proper iteration over active devices
  • Skip already handled devices to prevent duplicates
  • Clean removal of stopped devices

Also applies to: 314-315

screenpipe-app-tauri/src-tauri/src/main.rs (2)

79-80: LGTM! Good extension of shortcut configuration.

The new audio shortcut fields maintain consistency with existing shortcut patterns.


143-150: Well-implemented shortcut initialization!

The initialization follows the established pattern and properly handles defaults.

screenpipe-server/src/server.rs (2)

1610-1621: LGTM! Well-designed request/response structs.

The structs are well-designed with:

  • Proper serialization attributes
  • Optional fields with defaults
  • Clear and meaningful field names

1742-1743: LGTM! Well-structured API routes.

The new audio control routes follow REST conventions and maintain consistency with existing endpoints.

Cargo.toml (1)

39-39: LGTM! Verify version compatibility.

The addition of dashmap as a workspace dependency is appropriate for concurrent device state management. Version 6.1.0 is stable and well-tested.

Run this script to check for any newer versions with important updates:

✅ Verification successful

Version 6.1.0 is optimal ✓

The PR uses the latest stable version of dashmap, which is not yanked and actively maintained.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check latest dashmap versions and changelogs
curl -s https://crates.io/api/v1/crates/dashmap | jq -r '.versions[0] | {num:.num, yanked:.yanked}'

Length of output: 141

screenpipe-audio/Cargo.toml (1)

68-68: LGTM! Consistent workspace dependency.

The addition of dashmap using workspace inheritance ensures version consistency across the project.

screenpipe-app-tauri/app/page.tsx Show resolved Hide resolved
screenpipe-app-tauri/app/page.tsx Outdated Show resolved Hide resolved
screenpipe-audio/tests/core_tests.rs Show resolved Hide resolved
screenpipe-server/Cargo.toml Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
screenpipe-app-tauri/lib/api/index.ts (1)

83-110: Reduce code duplication in audio control methods.

The stopAudio method duplicates most of the logic from startAudio. Consider extracting the common functionality into a private helper method.

Here's a suggested refactor:

+ private async controlAudio(deviceName: string, action: "start" | "stop"): Promise<void> {
+   if (!deviceName?.trim()) {
+     throw new Error("Device name cannot be empty");
+   }
+   if (!deviceName.includes("(input)") && !deviceName.includes("(output)")) {
+     throw new Error("Invalid device name format. Must include '(input)' or '(output)'");
+   }
+   try {
+     const type = deviceName.includes("(input)") ? "Input" : "Output";
+     const name = deviceName.replaceAll("(input)", "").replaceAll("(output)", "").trim();
+     const response = await fetch(`${this.baseUrl}/audio/${action}`, {
+       method: "POST",
+       headers: {
+         "Content-Type": "application/json",
+       },
+       body: JSON.stringify({
+         device_name: name,
+         device_type: type,
+       }),
+     });
+
+     if (!response.ok) {
+       throw new Error(`failed to ${action} audio: ${response.statusText}`);
+     }
+
+     const data = await response.json();
+     if (!data.success) {
+       throw new Error(`failed to ${action} audio: ${data.message}`);
+     }
+   } catch (error) {
+     console.error(`error ${action}ing audio:`, error);
+     throw error;
+   }
+ }
+
+ async startAudio(deviceName: string): Promise<void> {
+   return this.controlAudio(deviceName, "start");
+ }
+
+ async stopAudio(deviceName: string): Promise<void> {
+   return this.controlAudio(deviceName, "stop");
+ }
screenpipe-server/src/bin/screenpipe-server.rs (2)

368-368: Consider parameterizing the sleep duration before inserting into sender_clone.

The hardcoded 15-second delay might benefit from being configurable or documented to clarify its purpose, improving maintainability.


558-578: Handle potential errors from rx.recv().await to ensure task reliability.

If all senders are dropped, rx.recv().await will return Err(RecvError::Closed), causing the loop to exit. Consider handling the error to maintain the task or explicitly break if intended.

screenpipe-server/src/core.rs (1)

20-20: Remove unused import: HashSet.

The HashSet import is unused and can be safely removed to clean up the codebase.

Apply this diff to remove the unused import:

-use std::collections::{HashMap, HashSet};
+use std::collections::HashMap;
🧰 Tools
🪛 GitHub Check: test-ubuntu

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-linux

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-macos

[warning] 20-20:
unused import: HashSet

screenpipe-server/src/server.rs (4)

1630-1667: Handle the result of state.audio_devices_tx.send in start_audio_device.

The send method can return an error if there are no active subscribers or if the channel is full. Consider handling or logging the error to ensure that control messages are sent successfully.

Apply this diff to handle the potential error:

-        let _ = state.audio_devices_tx.send((device.clone(), control));
+        if let Err(e) = state.audio_devices_tx.send((device.clone(), control)) {
+            error!("Failed to send control message: {}", e);
+        }

1670-1704: Handle the result of state.audio_devices_tx.send in stop_audio_device.

Similar to start_audio_device, handle or log any errors returned by the send method to ensure reliability.

Apply this diff to handle the potential error:

-        let _ = state.audio_devices_tx.send((device.clone(), DeviceControl {
+        if let Err(e) = state.audio_devices_tx.send((device.clone(), DeviceControl {
             is_running: false,
             is_paused: false,
-        }));
+        })) {
+            error!("Failed to send control message: {}", e);
+        }

1630-1667: Consider adding unit tests for start_audio_device handler.

Adding unit tests ensures the new API endpoint behaves as expected and helps prevent future regressions.

Would you like assistance in creating unit tests for this handler?


1670-1704: Consider adding unit tests for stop_audio_device handler.

Unit tests will help verify the correctness of the stop_audio_device function and enhance code reliability.

Do you want help in writing unit tests for this handler?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe4f1f and fd4744b.

📒 Files selected for processing (5)
  • Cargo.toml (1 hunks)
  • screenpipe-app-tauri/lib/api/index.ts (2 hunks)
  • screenpipe-server/src/bin/screenpipe-server.rs (6 hunks)
  • screenpipe-server/src/core.rs (6 hunks)
  • screenpipe-server/src/server.rs (14 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • Cargo.toml
🧰 Additional context used
🪛 GitHub Check: test-ubuntu
screenpipe-server/src/core.rs

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-linux
screenpipe-server/src/core.rs

[warning] 20-20:
unused import: HashSet

🪛 GitHub Check: test-macos
screenpipe-server/src/core.rs

[warning] 20-20:
unused import: HashSet

🪛 GitHub Actions: Linux Integration Test
screenpipe-server/src/server.rs

[error] 35-35: The name list_monitors is defined multiple times

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: test-windows
  • GitHub Check: test-windows
🔇 Additional comments (18)
screenpipe-app-tauri/lib/api/index.ts (2)

45-45: LGTM! Enhanced error handling.

The improved error handling with detailed logging will help with debugging issues.

Also applies to: 49-51


54-81: 🛠️ Refactor suggestion

Enhance robustness and documentation of audio device handling.

Several improvements could make this method more robust and maintainable:

  1. The device type determination is fragile and relies on string patterns.
  2. The method lacks input validation and documentation.

Consider these improvements:

+ /** 
+  * Starts audio recording/playback for the specified device.
+  * @param deviceName - The name of the audio device (must include "(input)" or "(output)")
+  * @throws {Error} If the device name is invalid or the API request fails
+  */
 async startAudio(deviceName: string): Promise<void> {
+  if (!deviceName?.trim()) {
+    throw new Error("Device name cannot be empty");
+  }
+  if (!deviceName.includes("(input)") && !deviceName.includes("(output)")) {
+    throw new Error("Invalid device name format. Must include '(input)' or '(output)'");
+  }
   try {
     const type = deviceName.includes("(input)") ? "Input" : "Output";
     const name = deviceName.replaceAll("(input)", "").replaceAll("(output)", "").trim();

Let's verify the device name format assumptions across the codebase:

✅ Verification successful

Input validation for audio device names is essential and aligns with codebase patterns.

The review comment's concerns about device name format validation are valid. The codebase consistently uses "(input)" and "(output)" suffixes for device names across multiple components, making the suggested input validation and documentation crucial for maintaining this convention.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for patterns of device name usage to verify the format assumptions
rg -A 2 "includes\(['\"].*input|output.*['\"]\)" 

Length of output: 15965

screenpipe-server/src/bin/screenpipe-server.rs (4)

39-39: LGTM!

The import of dashmap::DashMap is necessary for the updated audio device control implementation.


306-307: LGTM!

Initialization of audio_devices_control with DashMap enhances concurrent access and is correctly cloned for usage in different contexts.


470-478: LGTM!

The parameters for start_continuous_recording have been correctly updated to align with the new DashMap implementation for audio device control.


539-541: LGTM!

Creation of the audio_devices_tx broadcast channel and its cloning are properly implemented for transmitting audio device control messages.

screenpipe-server/src/core.rs (5)

6-6: LGTM!

Importing dashmap::DashMap is necessary for the new concurrent audio device control mechanism.


36-36: LGTM!

The function signature of start_continuous_recording has been appropriately updated to use DashMap for audio device control.


124-124: LGTM!

Passing audio_devices_control to create_whisper_channel ensures consistent audio device management across components.


297-297: LGTM!

The updated record_audio function signature correctly reflects the new DashMap usage.


311-331: LGTM!

Iterating over audio_devices_control using DashMap entries is appropriate for concurrent device state management.

screenpipe-server/src/server.rs (7)

50-50: LGTM!

Importing tokio::sync::broadcast is necessary for implementing the audio device control broadcast channel.


65-65: LGTM!

Updating AppState with audio_devices_tx aligns with the new broadcast mechanism for audio device control.


804-804: LGTM!

The Server struct's inclusion of audio_devices_tx is consistent with the updated architecture.


821-821: LGTM!

Constructor parameters for Server::new have been appropriately updated.


835-835: LGTM!

Initialization of Server with the new audio_devices_tx aligns with the broadcast-based design.


859-859: LGTM!

Passing audio_devices_tx into AppState ensures that audio device control messages are managed correctly.


1615-1622: LGTM!

The AudioDeviceControlRequest struct is properly defined to handle audio device control requests.

screenpipe-server/src/server.rs Outdated Show resolved Hide resolved
neo773 and others added 3 commits January 10, 2025 06:25
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
screenpipe-server/src/bin/screenpipe-server.rs (2)

306-307: Consider adding a capacity hint to DashMap initialization.

Since the number of audio devices is known in advance, you can optimize the DashMap initialization by providing an initial capacity hint to avoid reallocation.

-    let audio_devices_control = Arc::new(DashMap::new());
+    let audio_devices_control = Arc::new(DashMap::with_capacity(all_audio_devices.len()));

539-541: Consider the broadcast channel capacity.

The broadcast channel capacity of 100 might be insufficient for high-frequency device state changes, potentially leading to message drops. Consider:

  1. Making the capacity configurable
  2. Implementing a backpressure mechanism
  3. Logging when messages are dropped
screenpipe-server/src/server.rs (1)

1614-1626: Add documentation and validation for API structs.

The request/response structs would benefit from:

  1. Documentation comments explaining their purpose and usage
  2. Validation attributes for the device_name field
  3. Example usage in doc comments
+/// Request payload for controlling audio devices.
+/// 
+/// # Example
+/// ```
+/// let request = AudioDeviceControlRequest {
+///     device_name: "Microphone".to_string(),
+///     device_type: Some(DeviceType::Input),
+/// };
+/// ```
 #[derive(Deserialize)]
 pub struct AudioDeviceControlRequest {
+    #[serde(validate(length(min = 1)))]
     device_name: String,
     #[serde(default)]
     device_type: Option<DeviceType>,
 }

+/// Response payload for audio device control operations.
 #[derive(Serialize)]
 pub struct AudioDeviceControlResponse {
     success: bool,
     message: String,
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd4744b and 7cefbdd.

⛔ Files ignored due to path filters (1)
  • screenpipe-app-tauri/src-tauri/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • screenpipe-server/src/bin/screenpipe-server.rs (6 hunks)
  • screenpipe-server/src/server.rs (13 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test-windows
  • GitHub Check: test-ubuntu
  • GitHub Check: test-windows
  • GitHub Check: test-macos
  • GitHub Check: test-linux
🔇 Additional comments (2)
screenpipe-server/src/bin/screenpipe-server.rs (2)

39-40: LGTM!

The import of dashmap::DashMap is correctly placed and aligns with the architectural change from SegQueue to DashMap for concurrent device state management.


368-368: Review the 15-second delay before device control insertion.

The arbitrary 15-second delay before inserting the device control state could cause issues:

  1. Users might attempt to control devices during this delay window
  2. The delay doesn't account for device initialization time variations
  3. No error handling if the device isn't ready after the delay

Consider either removing the delay or implementing a proper device readiness check.

screenpipe-server/src/bin/screenpipe-server.rs Outdated Show resolved Hide resolved
screenpipe-server/src/server.rs Show resolved Hide resolved
screenpipe-server/src/server.rs Show resolved Hide resolved
screenpipe-server/src/server.rs Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
screenpipe-app-tauri/app/page.tsx (1)

126-145: ⚠️ Potential issue

Add consistent error handling and debouncing to stop operation.

The stop operation needs the same safeguards as the start operation:

  1. Missing debouncing with isProcessingRef
  2. Inconsistent Promise handling (using forEach instead of Promise.all)
  3. No cleanup in finally block
 listen("shortcut-stop-audio", async (event) => {
+  if (isProcessingRef.current) return;
+  isProcessingRef.current = true;
   try {
     const devices = await getAudioDevices();
     const pipeApi = new PipeApi();
-    devices.forEach((device) => {
-      pipeApi.stopAudio(device);
-    });
+    await Promise.all(devices.map(device => pipeApi.stopAudio(device)));
     toast({
       title: "audio stopped",
       description: "audio has been stopped",
     });
   } catch (error) {
     console.error("error stopping audio:", error);
     toast({
       title: "error stopping audio",
       description: error instanceof Error ? error.message : "unknown error occurred",
       variant: "destructive",
     });
+  } finally {
+    isProcessingRef.current = false;
   }
 }),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cefbdd and 26ce6a2.

📒 Files selected for processing (5)
  • screenpipe-app-tauri/app/page.tsx (3 hunks)
  • screenpipe-server/src/core.rs (6 hunks)
  • screenpipe-server/src/server.rs (13 hunks)
  • screenpipe-server/tests/endpoint_test.rs (1 hunks)
  • screenpipe-server/tests/tags_test.rs (1 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
screenpipe-app-tauri/app/page.tsx

[error] 101-101: Expected a property, an expression, or a method but instead found 'const isProcessingRef = React.useRef'.

Expected a property, an expression, or a method here.

(parse)


[error] 101-101: expected , but instead found (

Remove (

(parse)


[error] 101-101: expected , but instead found ;

Remove ;

(parse)


[error] 102-102: expected , but instead found (

Remove (

(parse)

⏰ Context from checks skipped due to timeout of 90000ms (5)
  • GitHub Check: test-macos
  • GitHub Check: test-ubuntu
  • GitHub Check: test-windows
  • GitHub Check: test-linux
  • GitHub Check: test-windows
🔇 Additional comments (12)
screenpipe-app-tauri/app/page.tsx (4)

4-4: LGTM!

The getStore import is appropriately placed alongside the related useSettings import.


87-99: LGTM!

The formatting changes improve code readability while maintaining the same functionality.


101-124: LGTM on the start audio implementation.

The implementation includes proper debouncing and error handling with the isProcessingRef.

🧰 Tools
🪛 Biome (1.9.4)

[error] 101-101: Expected a property, an expression, or a method but instead found 'const isProcessingRef = React.useRef'.

Expected a property, an expression, or a method here.

(parse)


[error] 101-101: expected , but instead found (

Remove (

(parse)


[error] 101-101: expected , but instead found ;

Remove ;

(parse)


[error] 102-102: expected , but instead found (

Remove (

(parse)


101-145: Consider extracting shared audio operation logic.

Both start and stop operations share similar patterns. Consider extracting the common logic into a shared function to improve maintainability.

+const handleAudioOperation = async (
+  operation: 'start' | 'stop',
+  handler: (device: string) => Promise<void>
+) => {
+  if (isProcessingRef.current) return;
+  isProcessingRef.current = true;
+  try {
+    const devices = await getAudioDevices();
+    const pipeApi = new PipeApi();
+    await Promise.all(devices.map(handler));
+    toast({
+      title: `audio ${operation}ed`,
+      description: `audio has been ${operation}ed`,
+    });
+  } catch (error) {
+    console.error(`error ${operation}ing audio:`, error);
+    toast({
+      title: `error ${operation}ing audio`,
+      description: error instanceof Error ? error.message : "unknown error occurred",
+      variant: "destructive",
+    });
+  } finally {
+    isProcessingRef.current = false;
+  }
+};

-listen("shortcut-start-audio", async () => { ... }),
-listen("shortcut-stop-audio", async () => { ... }),
+listen("shortcut-start-audio", () => 
+  handleAudioOperation('start', device => pipeApi.startAudio(device))
+),
+listen("shortcut-stop-audio", () => 
+  handleAudioOperation('stop', device => pipeApi.stopAudio(device))
+),
🧰 Tools
🪛 Biome (1.9.4)

[error] 101-101: Expected a property, an expression, or a method but instead found 'const isProcessingRef = React.useRef'.

Expected a property, an expression, or a method here.

(parse)


[error] 101-101: expected , but instead found (

Remove (

(parse)


[error] 101-101: expected , but instead found ;

Remove ;

(parse)


[error] 102-102: expected , but instead found (

Remove (

(parse)

screenpipe-server/tests/tags_test.rs (1)

34-34: LGTM! Channel size is appropriate for testing.

The transition to broadcast channel with a buffer size of 1000 is suitable for test scenarios.

screenpipe-server/tests/endpoint_test.rs (1)

37-37: LGTM! Consistent with tags_test.rs implementation.

The broadcast channel initialization maintains consistency across test files.

screenpipe-server/src/core.rs (3)

5-5: LGTM! DashMap is appropriate for concurrent device state management.

Using DashMap provides better concurrent access patterns and state management capabilities compared to SegQueue.


35-35: LGTM! Parameter type update aligns with architectural changes.

The transition to DashMap for audio_devices_control enhances concurrent device state management.


123-123: LGTM! Proper integration with whisper channel.

The whisper channel correctly receives the audio_devices_control reference.

screenpipe-server/src/server.rs (3)

1614-1625: LGTM! Well-structured request/response types.

The types are properly defined with appropriate optional fields and serialization attributes.


1796-1797: 🛠️ Refactor suggestion

Add rate limiting and authentication to audio endpoints.

The audio control endpoints need protection against abuse:

  1. Rate limiting to prevent DoS attacks
  2. Authentication to restrict access
  3. Request validation middleware
-        .route("/audio/start", post(start_audio_device))
-        .route("/audio/stop", post(stop_audio_device))
+        .route("/audio/start", post(start_audio_device))
+            .layer(RateLimitLayer::new(10, Duration::from_secs(60)))
+            .layer(AuthenticationLayer::new())
+        .route("/audio/stop", post(stop_audio_device))
+            .layer(RateLimitLayer::new(10, Duration::from_secs(60)))
+            .layer(AuthenticationLayer::new())

Likely invalid or redundant comment.


1627-1665: 🛠️ Refactor suggestion

Improve error handling and state validation.

The start handler needs improvements:

  1. Validate current device state before starting
  2. Handle broadcast send failures
  3. Update devices_status map
 async fn start_audio_device(
     State(state): State<Arc<AppState>>,
     Json(payload): Json<AudioDeviceControlRequest>,
 ) -> Result<JsonResponse<AudioDeviceControlResponse>, (StatusCode, JsonResponse<Value>)> {
     let device = AudioDevice {
         name: payload.device_name.clone(),
         device_type: payload.device_type.unwrap_or(DeviceType::Input),
     };

+    // Check if device is already running
+    if let Some(current_state) = state.devices_status.get(&device) {
+        if current_state.is_running {
+            return Ok(JsonResponse(AudioDeviceControlResponse {
+                success: true,
+                message: format!("device already running: {}", device.name),
+            }));
+        }
+    }

     let control = DeviceControl { 
         is_running: true, 
         is_paused: false 
     };
     
-    let _ = state.audio_devices_tx.send((device.clone(), control));
+    if let Err(e) = state.audio_devices_tx.send((device.clone(), control.clone())) {
+        error!("Failed to send device control message: {}", e);
+        return Err((StatusCode::INTERNAL_SERVER_ERROR,
+            JsonResponse(json!({
+                "error": "failed to start device",
+                "success": false
+            }))));
+    }
+
+    // Update devices_status
+    state.devices_status.insert(device.clone(), control);

Likely invalid or redundant comment.

screenpipe-app-tauri/app/page.tsx Show resolved Hide resolved
screenpipe-server/src/core.rs Show resolved Hide resolved
@neo773 neo773 changed the title New shortcut audio Add shortcut to toggle audio recording and REST API Jan 10, 2025
@@ -155,6 +156,7 @@ pub async fn create_whisper_channel(
output_path: &PathBuf,
vad_sensitivity: VadSensitivity,
languages: Vec<Language>,
audio_devices_control: Option<Arc<DashMap<AudioDevice, DeviceControl>>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is audio device control has to change speech to text code instead of audio device recording

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also does it work well with the audio streaming api

if i stop a device

Copy link
Contributor Author

@neo773 neo773 Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, start_realtime_recording() gets checked by parent block if device is running.

@louis030195
Copy link
Collaborator

will test this soon, questions:

  • does it allow to start/stop specific device? (it's ok if it's only in API for now and not in app UI)
  • how does it affects status? if i stop all audio device does it turn red?

@neo773
Copy link
Contributor Author

neo773 commented Jan 10, 2025

  • does it allow to start/stop specific device? (it's ok if it's only in API for now and not in app UI)

Yeah, it's granular you can stop/start specific devices.

  • how does it affects status? if i stop all audio device does it turn red?

No, current logic checks for both audio and video I think for it to turn red.

@louis030195
Copy link
Collaborator

louis030195 commented Jan 10, 2025

Screen.Recording.2025-01-10.at.10.16.39.AM.mov

issues i noticed, not sure if related to this PR:

  1. screenpipe started at app boot despite i had dev mode on
  2. stop button does not work anymore
  3. when assigning shortcut it should disable all shortcuts while we are assigning (this is UX problem, unrelated to this PR, not big deal for now)

@louis030195
Copy link
Collaborator

testing agai now

@louis030195
Copy link
Collaborator

seem to work\

one issue which seem only dev i think is stop button does not work sometimes, when starting the app with bun tauri dev

would it be possible to add in the status some kind of green/red or thing saying which devices are being recorded (monitor and audio deviec)?

@neo773
Copy link
Contributor Author

neo773 commented Jan 11, 2025

seem to work\

one issue which seem only dev i think is stop button does not work sometimes, when starting the app with bun tauri dev
,
would it be possible to add in the status some kind of green/red or thing saying which devices are being recorded (monitor and audio deviec)?

I think it should be doable, the entire modal needs a redesigned UI/UX imo, current one looks very MVPish/Alpha.

As for stop button I'm not sure, the code diff doesn't really delve into that parts. it's happened to me before too (before this PR).

It's also hard to reproduce

@louis030195
Copy link
Collaborator

/approve

Copy link

algora-pbc bot commented Jan 14, 2025

@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.

@louis030195 louis030195 merged commit d193a5a into mediar-ai:main Jan 14, 2025
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[bounty] add shortcut to stop start audio recording
2 participants