Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Fix audio file with cover image treated as video file #91

Open
wants to merge 8 commits into
base: 90-bug-audio-file-treated-as-a-video-file
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 38 additions & 10 deletions MediaProcessor/src/Engine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@
namespace MediaProcessor {

Engine::Engine(const std::filesystem::path& mediaPath)
: m_mediaPath(std::filesystem::absolute(mediaPath)) {}
: m_mediaPath(std::filesystem::absolute(mediaPath)),
hasAudioStream(false),
videoStreamCount(0) {}

bool Engine::processMedia() {
ConfigManager& configManager = ConfigManager::getInstance();
Expand Down Expand Up @@ -65,23 +67,49 @@ bool Engine::processVideo() {

MediaType Engine::getMediaType() const {
const std::string command =
"ffprobe -loglevel error -show_entries stream=codec_type "
"-of default=noprint_wrappers=1:nokey=1 \"" +
m_mediaPath.string() + "\"";
"ffprobe -loglevel error -show_entries stream -of json \"" + m_mediaPath.string() + "\"";

std::optional<std::string> output = Utils::runCommand(command, true);
if (!output || output->empty()) {
throw std::runtime_error("Failed to detect media type.");
}

std::string_view result = *output;
if (result.find("video") != std::string_view::npos) {
return MediaType::Video;
} else if (result.find("audio") != std::string_view::npos) {
nlohmann::json streamData = nlohmann::json::parse(*output);

for (const auto& stream : streamData["streams"]) {
if (hasValidVideoStream(stream)) {
return MediaType::Video;
} else if (hasValidAudioStream(stream)) {
hasAudioStream = true;
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l.82 here the else if is redundant as we're returning for valid video above. It should be a separate if statement

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l.82 here the else if is redundant as we're returning for valid video above. It should be a separate if statement

How is it redundant? I don't understand. Can you please explain it?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two conditional checks have no logical coupling or dependency. Since we already return if a valid video stream is found, the code following that if statement will only execute if the condition was false. So, the else if is redundant, as the second check is independent and can simply be a separate if.


// If we haven't returned for Video and have found a valid audio stream
if (hasAudioStream) {
return MediaType::Audio;
} else {
throw std::runtime_error("Unsupported media type detected.");
}

// otherwise
return MediaType::Unsupported;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2 lines of comments here were more for providing clarity in the discussion, but typically we refrain to add comments that state otherwise obvious operations. So in this case, let's remove the comments here


bool Engine::hasValidVideoStream(const nlohmann::json& stream) const {
if (stream["codec_type"] == "video") {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also adding to my comment below, this should be an early return. If it's not a video stream to begin with, we should immediately return false.

++videoStreamCount;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incrementing this here (even after the refactor to make it a local var to getMediaType) would make it more difficult to understand the workflow. This should be done at the call side.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incrementing this here (even after the refactor to make it a local var to getMediaType) would make it more difficult to understand the workflow. This should be done at the call side.

Removing this variable. As we will return from getMediaType() as soon as we will get an actual video stream.(Video stream that is not cover image.)

In case when we encounter video stream with cover image, we will return false from here. In next iteration, when an actual video stream is computed, we will return true after checking avg_frame_rate of that stream. I do suggest to check for avg_frame_rate for that stream as well.

I don't think we should return just by video stream count being more than 1. We should check for avg_frame_rate. What do you think about this? let me know.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked the changes.

First off, we're now saying any stream that has the avg frame rate field will be treated as an invalid video stream. What if ffprobe returns sensible values for actual videos with that field? which wouldn't be surprising
Checking if the frame rate is exactly 0/0 and invalidating then, is logical. To determine if a video stream is valid, perhaps you can check non-zero duration in addition.

Also, I don't see the point of refactoring the function name to isActualVideoStream.

Regarding your question: As soon as we have even one single valid video stream, that would classify the media file as a video and we should return Video. There is no longer a need to check each of the remaining streams.
On the contrary, to determine if an audio file only has a cover attached, we have to ensure that media file has only 1 video stream whose avg frame rate is exactly 0/0.

I hope this answers your question, and please let me know otherwise.

Copy link
Author

@Rushi1109 Rushi1109 Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First off, we're now saying any stream that has the avg frame rate field will be treated as an invalid video stream. What if ffprobe returns sensible values for actual videos with that field? which wouldn't be surprising
Checking if the frame rate is exactly 0/0 and invalidating then, is logical. To determine if a video stream is valid, perhaps you can check non-zero duration in addition.

The current implementation works as described below:

  1. It checks for the codec_type field. It should be video.
  2. It then ensures the avg_frame_rate field exists.
  3. And then we check if the avg_frame_rate is equal to 0/0 which indicates it is a static media.

In all the above cases we return false.

  1. Otherwise we return true as the stream is now actual video stream.

The thing with the duration field would be problematic, because when we checked earlier, the ffmpeg tool outputs the duration of static media same as audio duration. So, I don't think we should rely on duration field.

The current implementation look good to me. As there is early return when we encounter first ever valid video stream. (We terminate the for loop from that point)

But, we do iterate when the first stream is not valid video stream (static media, cover image).

Let me know if there's something I've misunderstood.


if (videoStreamCount > 1 || !hasZeroFrameRate(stream["avg_frame_rate"])) {
return true;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not cover the inexistence of the field, and may cause a crash. Please ensure the field is available and return false otherwise. Similarly, do a return false, instead of nesting under a conditional level

}
return false;
}

bool Engine::hasValidAudioStream(const nlohmann::json& stream) const {
return (stream["codec_type"] == "audio");
}

bool Engine::hasZeroFrameRate(const std::string& frameRate) const {
return (frameRate == "0/0");
}

} // namespace MediaProcessor
30 changes: 30 additions & 0 deletions MediaProcessor/src/Engine.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define ENGINE_H

#include <filesystem>
#include <nlohmann/json.hpp>

namespace MediaProcessor {

Expand All @@ -26,6 +27,8 @@ class Engine {

private:
std::filesystem::path m_mediaPath;
mutable bool hasAudioStream;
mutable int videoStreamCount;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason to add these as member variables is not clear to me. They do not represent persistent state for the Engine class. Their meanings persist through a function call of getMediaType() and do not really make sense outside of that.

Although not critical here as we won't keep them, but for the future: before adding mutable-specified variables, we should take a good look at why we intend to that. It could point to a bad design choice that I'd rather address at source than to go the mutable route.

Please refactor these variables to be local to the respective function, where they're used.


/**
* @brief Processes an audio file.
Expand All @@ -49,6 +52,33 @@ class Engine {
* @throws std::runtime_error if detection fails.
*/
MediaType getMediaType() const;

/**
* @brief Checks if the provided stream is valid video stream
*
* @param stream json data of stream
*
* @return True if the stream is valid. False otherwise.
*/
bool hasValidVideoStream(const nlohmann::json& stream) const;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does valid mean here? We should describe what exactly that means here


/**
* @brief Checks if the provided stream is valid audio stream
*
* @param stream json data of stream
*
* @return True if the stream is valid. False otherwise.
*/
bool hasValidAudioStream(const nlohmann::json& stream) const;

/**
* @brief Checks if the stream's avg_frame_rate is 0/0
*
* @param frameRate the avg_frame_rate field from stream
*
* @return True if avg_frame_rate equals to 0/0. False otherwise.
*/
bool hasZeroFrameRate(const std::string& frameRate) const;
};

} // namespace MediaProcessor
Expand Down