-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: Fix audio file with cover image treated as video file #91
base: 90-bug-audio-file-treated-as-a-video-file
Are you sure you want to change the base?
Conversation
Hey @Rushi1109, thanks for the PR. Frankly, I wasn't familiar with the full output set of Thanks for checking! |
Hey @omeryusufyagci, When looking at the raw data of the media I see But, When I tried to get the information about |
Moreover @omeryusufyagci, I have researched a bit and got to know that If the video stream contains a static image, the average frame rate for that stream will be Do you see any issue with this approach? |
Please make sure you include Could you try like that, and if there's still an issue please provide the command you used and the output you received.
Thanks for figuring this out! Good to know that'd be a valid fix. The Could you please verify this and let me know if it resolves the issue? Thank you! |
Yes I included Can you check it on your side? Do you see disposition in output? |
HI @Rushi1109, you're right. I was able to reproduce the same behavior. It depends on what's bundled with the metadata, and we won't be able to count on the existence of that field. That means we'll have to check for this ourselves. You're also correct, that a cover image is reported as a video stream that has In this case, I propose to get the same ffprobe output as json (fyi, we're already using nlohmann/json in the project), parse the avg frame rate into a vector, and check if a given file has only 1 video stream and that video stream has 0/0 avg frame rate. Meaning, as soon as we iterate through a second video stream, we should conclude our check to avoid wasting time there. Could you update this PR to implement this? Thanks |
Sure. I'll look into that today. |
Hello @omeryusufyagci, I have made required changes. I have tested it manually and it is working as expected. Can you test it on your side and let me know if any change is required? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes @Rushi1109 !
Here's some comments for you:
About the parseFrameRate()
addition:
The MediaProcessor::Utils
NS is normally meant for generic utility. In this case, the frame rate helper is intrinsically coupled with getMediaType()
. So, we should move this under the Engine class as a private member function. Also, for expressiveness, let's rename it to hasZeroFrameRate()
.
Regarding the implementation of this helper, I believe we can make it simpler. We want to know if an otherwise audio file has a video stream that has exactly "0/0" for the avg frame rate. So, we can simply and do a full string comparison, instead of parsing each char in the field. Pass a const ref of the stream object, and it can be as simple as a 1-line return statement that checks if it matches the expected case. We should however, make sure to cover the cases where the field is missing or invalid.
Updates on getMediaType()
:
Good start on this with the json parsing, but we can also simplify this similar to above. We should also encapsulate the logic where we're determining if we have a valid video stream. Otherwise, it makes it needlessly difficult understand what this function does at a glance.
Here's roughly what I have in mind:
for (const auto& stream : streams) {
if (hasValidVideoStream(stream)) {
return MediaType::Video;
}
if (hasValidAudioStream(stream)) {
hasAudioStream = true; // define this flag false before the loop
}
}
// if we haven't returned for Video and have found a valid audio stream
if (hasAudioStream) {
return MediaType::Audio;
}
// otherwise
return MediaType::Unsupported;
}
So please encapsulate both checks (hasValid...
) into individual functions. I realize the one for audio will likely be a 1-line return statement, but I'd still prefer to have it for the sake of consistency.
Within the hasValidVideoStream()
, you'd call hasZeroFrameRate()
with a const ref of the json stream for the avg frame rate field.
The other 2 helpers should also be called with const refs of the stream and all 3 new functions should be const themselves too.
This approach should make it easier to understand what getMediaType()
does, and in general simplify the deduction logic to see if we have a cover image. Could you please update this PR accordingly?
Please let me know if you have any questions along the way. Many thanks!
Updated the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Rushi1109, thanks for the updates. Definitely moving in the right direction! Could you please take a look and update the PR? Thanks!
MediaProcessor/src/Engine.h
Outdated
mutable bool hasAudioStream; | ||
mutable int videoStreamCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason to add these as member variables is not clear to me. They do not represent persistent state for the Engine class. Their meanings persist through a function call of getMediaType()
and do not really make sense outside of that.
Although not critical here as we won't keep them, but for the future: before adding mutable-specified variables, we should take a good look at why we intend to that. It could point to a bad design choice that I'd rather address at source than to go the mutable
route.
Please refactor these variables to be local to the respective function, where they're used.
MediaProcessor/src/Engine.cpp
Outdated
// If we haven't returned for Video and have found a valid audio stream | ||
if (hasAudioStream) { | ||
return MediaType::Audio; | ||
} else { | ||
throw std::runtime_error("Unsupported media type detected."); | ||
} | ||
|
||
// otherwise | ||
return MediaType::Unsupported; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 2 lines of comments here were more for providing clarity in the discussion, but typically we refrain to add comments that state otherwise obvious operations. So in this case, let's remove the comments here
MediaProcessor/src/Engine.cpp
Outdated
for (const auto& stream : streamData["streams"]) { | ||
if (hasValidVideoStream(stream)) { | ||
return MediaType::Video; | ||
} else if (hasValidAudioStream(stream)) { | ||
hasAudioStream = true; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l.82 here the else if is redundant as we're returning for valid video above. It should be a separate if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l.82 here the else if is redundant as we're returning for valid video above. It should be a separate if statement
How is it redundant? I don't understand. Can you please explain it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two conditional checks have no logical coupling or dependency. Since we already return if a valid video stream is found, the code following that if
statement will only execute if the condition was false. So, the else if
is redundant, as the second check is independent and can simply be a separate if.
MediaProcessor/src/Engine.h
Outdated
/** | ||
* @brief Checks if the provided stream is valid video stream | ||
* | ||
* @param stream json data of stream | ||
* | ||
* @return True if the stream is valid. False otherwise. | ||
*/ | ||
bool hasValidVideoStream(const nlohmann::json& stream) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does valid mean here? We should describe what exactly that means here
MediaProcessor/src/Engine.cpp
Outdated
|
||
bool Engine::hasValidVideoStream(const nlohmann::json& stream) const { | ||
if (stream["codec_type"] == "video") { | ||
++videoStreamCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incrementing this here (even after the refactor to make it a local var to getMediaType) would make it more difficult to understand the workflow. This should be done at the call side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incrementing this here (even after the refactor to make it a local var to getMediaType) would make it more difficult to understand the workflow. This should be done at the call side.
Removing this variable. As we will return from getMediaType()
as soon as we will get an actual video stream.(Video stream that is not cover image.)
In case when we encounter video stream with cover image, we will return false from here. In next iteration, when an actual video stream is computed, we will return true after checking avg_frame_rate
of that stream. I do suggest to check for avg_frame_rate
for that stream as well.
I don't think we should return just by video stream count being more than 1. We should check for avg_frame_rate. What do you think about this? let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked the changes.
First off, we're now saying any stream that has the avg frame rate field will be treated as an invalid video stream. What if ffprobe returns sensible values for actual videos with that field? which wouldn't be surprising
Checking if the frame rate is exactly 0/0 and invalidating then, is logical. To determine if a video stream is valid, perhaps you can check non-zero duration in addition.
Also, I don't see the point of refactoring the function name to isActualVideoStream
.
Regarding your question: As soon as we have even one single valid video stream, that would classify the media file as a video and we should return Video. There is no longer a need to check each of the remaining streams.
On the contrary, to determine if an audio file only has a cover attached, we have to ensure that media file has only 1 video stream whose avg frame rate is exactly 0/0.
I hope this answers your question, and please let me know otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off, we're now saying any stream that has the avg frame rate field will be treated as an invalid video stream. What if ffprobe returns sensible values for actual videos with that field? which wouldn't be surprising
Checking if the frame rate is exactly 0/0 and invalidating then, is logical. To determine if a video stream is valid, perhaps you can check non-zero duration in addition.
The current implementation works as described below:
- It checks for the
codec_type
field. It should be video. - It then ensures the
avg_frame_rate
field exists. - And then we check if the
avg_frame_rate
is equal to0/0
which indicates it is a static media.
In all the above cases we return false.
- Otherwise we return true as the stream is now actual video stream.
The thing with the duration field would be problematic, because when we checked earlier, the ffmpeg tool outputs the duration of static media same as audio duration. So, I don't think we should rely on duration field.
The current implementation look good to me. As there is early return when we encounter first ever valid video stream. (We terminate the for loop from that point)
But, we do iterate when the first stream is not valid video stream (static media, cover image).
Let me know if there's something I've misunderstood.
MediaProcessor/src/Engine.cpp
Outdated
} | ||
|
||
bool Engine::hasValidVideoStream(const nlohmann::json& stream) const { | ||
if (stream["codec_type"] == "video") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also adding to my comment below, this should be an early return. If it's not a video stream to begin with, we should immediately return false.
MediaProcessor/src/Engine.cpp
Outdated
if (videoStreamCount > 1 || !hasZeroFrameRate(stream["avg_frame_rate"])) { | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not cover the inexistence of the field, and may cause a crash. Please ensure the field is available and return false otherwise. Similarly, do a return false, instead of nesting under a conditional level
fixes the bug raised in #90
The
ffprobe
tool is used to check if any stream contains codec_name as mjpeg.If mjpeg is present, the media is treated as Audio, other media is treated as Video