-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: update plot correlation functions #104
Conversation
WalkthroughThe pull request introduces multiple new functions and enhancements to the Changes
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/readii/analyze/plot_correlation.py (2)
191-192
: Optional docstring improvement.The return statement is straightforward, but consider adding descriptive text to the docstring near lines 191-192 to provide clarity about the returned values, especially if future maintainers need more context on bin_values versus bin_edges.
225-262
: Save directory & error handling.
- The usage of PlotWriter to control output directories under lines 235-255 is a good approach for consistent file saving.
- The conditional check at lines 248-249 logs a warning if a file already exists, preventing accidental overwrites. This is excellent for data integrity.
- Consider adding a try-except around the actual saving (line 254) if you want to log or handle file I/O errors more gracefully, although the current approach might suffice if PlotWriter handles exceptions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pixi.lock
is excluded by!**/*.lock
📒 Files selected for processing (1)
src/readii/analyze/plot_correlation.py
(3 hunks)
🔇 Additional comments (2)
src/readii/analyze/plot_correlation.py (2)
Line range hint 1-12
: Imports and integration.
These newly added imports for getCrossCorrelations, getSelfCorrelations, and PlotWriter are appropriate and clearly integrate with the existing correlation and plotting code. Ensure that the respective modules are available at runtime and that you maintain consistent versioning of these dependencies within the project.
193-224
: Function naming and usage clarity.
The function name plotSelfCorrHeatmap accurately describes its purpose. The introductory docstring is also consistent with the code. The parameter naming (especially feature_type_name, correlation_method, cmap, save_dir_path) is self-explanatory, enhancing readability.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #104 +/- ##
==========================================
+ Coverage 41.91% 47.66% +5.75%
==========================================
Files 33 33
Lines 1453 1502 +49
==========================================
+ Hits 609 716 +107
+ Misses 844 786 -58 ☔ View full report in Codecov by Sentry. |
This makes checking for a file's existence prior to saving easier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/analyze/test_plot_correlation.py (1)
83-84
: Remove f-string prefixThe string doesn't contain any placeholders, so the f-string prefix is unnecessary.
- assert corr_fig.get_suptitle() == f"Correlation Heatmap", \ + assert corr_fig.get_suptitle() == "Correlation Heatmap", \🧰 Tools
🪛 Ruff (0.8.2)
83-83: f-string without any placeholders
Remove extraneous
f
prefix(F541)
src/readii/analyze/plot_correlation.py (1)
259-572
: Consider performance optimization for large matricesFor large correlation matrices, consider:
- Using numpy's optimized functions for matrix operations
- Implementing caching for frequently accessed correlation data
- Adding progress logging for long-running operations
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/readii/analyze/plot_correlation.py
(5 hunks)src/readii/io/writers/plot_writer.py
(2 hunks)tests/analyze/test_plot_correlation.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/analyze/test_plot_correlation.py
2-2: readii.io.writers.plot_writer.PlotWriter
imported but unused
Remove unused import: readii.io.writers.plot_writer.PlotWriter
(F401)
12-12: pathlib.Path
imported but unused
Remove unused import: pathlib.Path
(F401)
83-83: f-string without any placeholders
Remove extraneous f
prefix
(F541)
🔇 Additional comments (6)
src/readii/io/writers/plot_writer.py (1)
22-25
: LGTM: Well-structured exception handling enhancement
The addition of PlotWriterPlotExistsError
improves error handling specificity for existing plot scenarios.
tests/analyze/test_plot_correlation.py (1)
18-132
: LGTM: Comprehensive test coverage
The test suite provides thorough coverage of the correlation plotting functionality:
- Well-structured fixtures for test data
- Parameterized tests for different correlation types
- Proper validation of output paths and file existence
🧰 Tools
🪛 Ruff (0.8.2)
83-83: f-string without any placeholders
Remove extraneous f
prefix
(F541)
src/readii/analyze/plot_correlation.py (4)
17-49
: LGTM: Well-implemented save functionality
The saveCorrelationHeatmap
function is well-structured with proper:
- Error handling using the new PlotWriterPlotExistsError
- Path construction for different feature types
- Logging of warnings and debug information
287-287
: Fix docstring parameter name mismatch
The docstring references save_path
but the parameter is named save_dir_path
.
- if save_path is not None:
+ if save_dir_path is not None:
Also applies to: 361-361
295-296
: Verify diagonal parameter usage for self-correlation
Setting diagonal=True
for self-correlation heatmaps is correct as it helps visualize the symmetry of self-correlations.
369-370
: LGTM: Correct diagonal parameter for cross-correlation
Setting diagonal=False
for cross-correlation heatmaps is correct as cross-correlations don't have meaningful diagonal elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (2)
tests/analyze/test_plot_correlation.py (1)
68-71
: Avoid hardcoding test values.The test uses hardcoded values (4.0 and 30.0) for bin validation. Consider:
- Using dynamic calculation based on the input data
- Adding tolerance for floating-point comparisons
- Documenting why these specific values are expected
- assert bin_values[0] == 4.0, \ - f"Wrong first bin value, expect 4.0, got {bin_values[0]}" - assert bin_values[-1] == 30.0, \ - f"Wrong last value, expect 30.0, got {bin_values[-1]}" + # Calculate expected values based on the correlation matrix + expected_first_bin = len([x for x in correlation_matrix.values.flatten() if -1.0 <= x < -0.98]) + expected_last_bin = len([x for x in correlation_matrix.values.flatten() if 0.98 <= x <= 1.0]) + + np.testing.assert_allclose(bin_values[0], expected_first_bin, rtol=1e-10, + err_msg=f"Wrong first bin value, expect {expected_first_bin}, got {bin_values[0]}") + np.testing.assert_allclose(bin_values[-1], expected_last_bin, rtol=1e-10, + err_msg=f"Wrong last bin value, expect {expected_last_bin}, got {bin_values[-1]}")src/readii/analyze/plot_correlation.py (1)
487-493
: Consider adding correlation range to histogram title.For better context, consider including the correlation range in the histogram title or subtitle:
cross_corr_hist, _, _ = plotCorrelationHistogram(cross_corr, num_bins=num_bins, xlabel = f"{correlation_method.capitalize()} Correlation", y_lower_bound = y_lower_bound, y_upper_bound = y_upper_bound, - title = f"Distribution of {correlation_method.capitalize()} Cross Correlations", + title = f"Distribution of {correlation_method.capitalize()} Cross Correlations (Range: -1 to 1)", subtitle=f"{vertical_feature_name} vs {horizontal_feature_name}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/readii/analyze/plot_correlation.py
(5 hunks)tests/analyze/test_plot_correlation.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/analyze/test_plot_correlation.py
2-2: readii.io.writers.plot_writer.PlotWriter
imported but unused
Remove unused import: readii.io.writers.plot_writer.PlotWriter
(F401)
16-16: pathlib.Path
imported but unused
Remove unused import: pathlib.Path
(F401)
87-87: f-string without any placeholders
Remove extraneous f
prefix
(F541)
146-146: f-string without any placeholders
Remove extraneous f
prefix
(F541)
159-159: f-string without any placeholders
Remove extraneous f
prefix
(F541)
171-171: f-string without any placeholders
Remove extraneous f
prefix
(F541)
184-184: f-string without any placeholders
Remove extraneous f
prefix
(F541)
🔇 Additional comments (5)
tests/analyze/test_plot_correlation.py (2)
1-2
: Remove unused imports.
The imports PlotWriter
and Path
are not used in the test file.
🧰 Tools
🪛 Ruff (0.8.2)
2-2: readii.io.writers.plot_writer.PlotWriter
imported but unused
Remove unused import: readii.io.writers.plot_writer.PlotWriter
(F401)
22-52
: LGTM! Well-structured test fixtures.
The fixtures provide a clean setup for the test environment with:
- Temporary directory for correlations
- Feature names for vertical and horizontal axes
- Correlation method specification
- Random correlation matrix generation
src/readii/analyze/plot_correlation.py (3)
347-347
:
Fix docstring parameter name mismatch.
The docstring references 'save_path' but the parameter is named 'save_dir_path':
- if save_path is not None:
+ if save_dir_path is not None:
Likely invalid or redundant comment.
287-287
:
Fix docstring parameter name mismatch.
The docstring references 'save_path' but the parameter is named 'save_dir_path':
- if save_path is not None:
+ if save_dir_path is not None:
Likely invalid or redundant comment.
355-357
:
Remove diagonal masking for cross-correlation heatmap.
Setting diagonal=True for cross-correlations is incorrect as the matrix represents correlations between different feature types. The diagonal has no special meaning in this context.
cross_corr_heatmap = plotCorrelationHeatmap(cross_corr,
- diagonal=False,
cmap=cmap,
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🔭 Outside diff range comments (1)
src/readii/analyze/plot_correlation.py (1)
Line range hint
239-314
: Update return type annotation for plotCorrelationHistogram.The function's return type has changed to include bin values and edges, but the docstring needs updating to match the implementation.
Update the docstring to clearly document the tuple return type:
def plotCorrelationHistogram(...) -> tuple[Figure, np.ndarray, np.ndarray]: """ ... Returns ------- tuple[Figure, np.ndarray, np.ndarray] - Figure: matplotlib figure object containing the histogram - np.ndarray: array of bin values - np.ndarray: array of bin edges """
🧹 Nitpick comments (1)
tests/analyze/test_plot_correlation.py (1)
62-71
: Fix assertion message in histogram test.The assertion messages use f-strings but don't utilize any string interpolation.
Apply this diff to fix the f-string usage:
- f"Wrong first bin value, expect 4.0, got {bin_values[0]}" + "Wrong first bin value, expect 4.0, got {}".format(bin_values[0]) - f"Wrong last value, expect 30.0, got {bin_values[-1]}" + "Wrong last value, expect 30.0, got {}".format(bin_values[-1])
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/readii/analyze/plot_correlation.py
(5 hunks)tests/analyze/test_plot_correlation.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/analyze/test_plot_correlation.py
15-15: pathlib.Path
imported but unused
Remove unused import: pathlib.Path
(F401)
86-86: f-string without any placeholders
Remove extraneous f
prefix
(F541)
145-145: f-string without any placeholders
Remove extraneous f
prefix
(F541)
158-158: f-string without any placeholders
Remove extraneous f
prefix
(F541)
170-170: f-string without any placeholders
Remove extraneous f
prefix
(F541)
183-183: f-string without any placeholders
Remove extraneous f
prefix
(F541)
🪛 GitHub Check: codecov/patch
src/readii/analyze/plot_correlation.py
[warning] 72-73: src/readii/analyze/plot_correlation.py#L72-L73
Added lines #L72 - L73 were not covered by tests
[warning] 76-76: src/readii/analyze/plot_correlation.py#L76
Added line #L76 was not covered by tests
[warning] 131-132: src/readii/analyze/plot_correlation.py#L131-L132
Added lines #L131 - L132 were not covered by tests
[warning] 135-135: src/readii/analyze/plot_correlation.py#L135
Added line #L135 was not covered by tests
[warning] 363-363: src/readii/analyze/plot_correlation.py#L363
Added line #L363 was not covered by tests
[warning] 369-369: src/readii/analyze/plot_correlation.py#L369
Added line #L369 was not covered by tests
[warning] 423-423: src/readii/analyze/plot_correlation.py#L423
Added line #L423 was not covered by tests
[warning] 429-429: src/readii/analyze/plot_correlation.py#L429
Added line #L429 was not covered by tests
[warning] 489-489: src/readii/analyze/plot_correlation.py#L489
Added line #L489 was not covered by tests
[warning] 495-495: src/readii/analyze/plot_correlation.py#L495
Added line #L495 was not covered by tests
[warning] 555-555: src/readii/analyze/plot_correlation.py#L555
Added line #L555 was not covered by tests
[warning] 561-561: src/readii/analyze/plot_correlation.py#L561
Added line #L561 was not covered by tests
🔇 Additional comments (2)
tests/analyze/test_plot_correlation.py (2)
1-19
: LGTM: Clean imports and necessary test dependencies.
The imports are well-organized and include all necessary dependencies for testing the correlation plotting functionality.
🧰 Tools
🪛 Ruff (0.8.2)
15-15: pathlib.Path
imported but unused
Remove unused import: pathlib.Path
(F401)
21-51
: LGTM: Well-structured test fixtures.
The fixtures are well-designed:
- Scoped appropriately at module level
- Generate consistent test data using fixed random seed
- Provide reusable test data for all test cases
def test_save_corr_heatmap(correlation_matrix, correlations_dir, correlation_type, feature_types): | ||
"""Test saving a heatmap from a cross-correlation matrix""" | ||
corr_fig = plotCorrelationHeatmap(correlation_matrix) | ||
|
||
expected_path = correlations_dir / "heatmap" / "nipy_spectral" / ("_vs_".join(feature_types)) + f"_{correlation_type}_correlation_heatmap.png" | ||
|
||
actual_path = saveCorrelationHeatmap(corr_fig, | ||
correlations_dir, | ||
cmap="nipy_spectral", | ||
feature_types=feature_types, | ||
correlation_type=correlation_type) | ||
assert actual_path == expected_path, \ | ||
f"Wrong path returned, expect {expected_path}" | ||
assert actual_path.exists(), \ | ||
"Figure is not being saved to path provided or at all." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add test coverage for error handling.
The test suite lacks coverage for error scenarios in the save functions, particularly for the PlotWriterPlotExistsError
exception.
Add test cases for:
- Attempting to save to an existing path with
overwrite=False
- Attempting to save with invalid paths
- Error handling in the plot generation
Example test case:
def test_save_corr_heatmap_exists():
"""Test saving a heatmap when file already exists."""
with pytest.raises(PlotWriterPlotExistsError):
saveCorrelationHeatmap(corr_fig,
correlation_directory,
cmap="nipy_spectral",
feature_types=["vertical"],
correlation_type="pearson_self",
overwrite=False)
def saveCorrelationHeatmap(plot_figure:Figure, | ||
correlation_directory:Path, | ||
cmap:str, | ||
feature_types:list[str], | ||
correlation_type:str, | ||
overwrite:bool = False) -> Path: | ||
"""Save a heatmap figure to a file with a PlotWriter. | ||
|
||
Parameters | ||
---------- | ||
plot_figure : matplotlib.figure.Figure | ||
The plot to save. | ||
correlation_directory : pathlib.Path | ||
The directory to save the heatmap to. | ||
cmap : str | ||
The colormap used for the heatmap. | ||
feature_types : list[str] | ||
The feature types on the x and y axes of the heatmap. Cross correlatins will be concatenated with "_vs_" to create the title. | ||
correlation_type : str | ||
The correlation method + type used for the heatmap. For example, "pearson_self". | ||
overwrite : bool, optional | ||
Whether to overwrite an existing file. The default is False. | ||
|
||
Returns | ||
------- | ||
Path | ||
The path to the saved file. | ||
|
||
Example | ||
------- | ||
>>> saveCorrelationHeatmap(corr_fig, | ||
correlation_directory=Path("correlations"), | ||
cmap="nipy_spectral", | ||
feature_types=["vertical", "horizontal"], | ||
correlation_type="pearson_cross") | ||
|
||
File will be saved to correlations/heatmap/nipy_spectral/vertical_vs_horizontal_pearson_cross_correlation_heatmap.png | ||
""" | ||
# Set up the writer | ||
corr_heatmap_writer = PlotWriter(root_directory = correlation_directory, | ||
filename_format = "heatmap/{ColorMap}/{FeaturesPlotted}_{CorrelationType}_correlation_heatmap.png", | ||
overwrite = overwrite, | ||
create_dirs = True) | ||
|
||
# Turn feature types into a string | ||
# Single feature type will be in the form "feature_type" | ||
# Multiple feature types will be in the form "feature_type_vs_feature_type" | ||
feature_type_str = "_vs_".join(feature_types) | ||
|
||
# Save the heatmap | ||
try: | ||
return corr_heatmap_writer.save(plot_figure, | ||
ColorMap=cmap, | ||
FeaturesPlotted=feature_type_str, | ||
CorrelationType=correlation_type) | ||
|
||
except PlotWriterPlotExistsError as e: | ||
logger.warning(e) | ||
|
||
# If plot file already exists, return the path to the existing plot | ||
return corr_heatmap_writer.resolve_path(ColorMap=cmap, | ||
FeaturesPlotted=feature_type_str, | ||
CorrelationType=correlation_type) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for invalid paths and improve test coverage.
The saveCorrelationHeatmap
function has uncovered error handling code:
- Lines 72-73: Exception handling
- Line 76: Path resolution
- Add validation for the input parameters
- Add test coverage for error scenarios
- Consider adding a custom exception for invalid paths
Example validation:
if not isinstance(correlation_directory, Path):
raise TypeError("correlation_directory must be a Path object")
if not isinstance(feature_types, list) or not all(isinstance(x, str) for x in feature_types):
raise TypeError("feature_types must be a list of strings")
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 72-73: src/readii/analyze/plot_correlation.py#L72-L73
Added lines #L72 - L73 were not covered by tests
[warning] 76-76: src/readii/analyze/plot_correlation.py#L76
Added line #L76 was not covered by tests
def plotSelfCorrHeatmap(correlation_matrix:pd.DataFrame, | ||
feature_type_name:str, | ||
correlation_method:str = "pearson", | ||
cmap:str='nipy_spectral', | ||
save_dir_path:Optional[str] = None) -> tuple[Figure | Figure, Path]: | ||
"""Plot a heatmap of the self correlations from a correlation matrix. | ||
|
||
Parameters | ||
---------- | ||
correlation_matrix : pd.DataFrame | ||
Dataframe containing the correlation matrix to plot. | ||
feature_type_name : str | ||
Name of the feature type to get self correlations for. Must be the suffix of some feature names in the correlation matrix. | ||
correlation_method : str, optional | ||
Method to use for calculating correlations. Default is "pearson". | ||
cmap : str, optional | ||
Colormap to use for the heatmap. Default is "nipy_spectral". | ||
save_dir_path : str, optional | ||
Path to save the heatmap to. If None, the heatmap will not be saved. Default is None. | ||
File will be saved to {save_dir_path}/heatmap/{cmap}/{feature_type_name}_{correlation_method}_self_correlation_heatmap.png | ||
|
||
Returns | ||
------- | ||
self_corr_heatmap : matplotlib.pyplot.figure | ||
Figure object containing the heatmap of the self correlations. | ||
if save_path is not None: | ||
self_corr_save_path : Path | ||
Path to the saved heatmap. | ||
""" | ||
# Get the self correlations for the specified feature type | ||
self_corr = getSelfCorrelations(correlation_matrix, feature_type_name) | ||
|
||
# Make the heatmap figure | ||
self_corr_heatmap = plotCorrelationHeatmap(self_corr, | ||
diagonal=True, | ||
cmap=cmap, | ||
xlabel=feature_type_name, | ||
ylabel=feature_type_name, | ||
title=f"{correlation_method.capitalize()} Self Correlations", | ||
subtitle=f"{feature_type_name}") | ||
|
||
if save_dir_path is not None: | ||
# Save the heatmap to a png file | ||
self_corr_save_path = saveCorrelationHeatmap(self_corr_heatmap, | ||
save_dir_path, | ||
cmap=cmap, | ||
feature_types=[feature_type_name], | ||
correlation_type=f"{correlation_method}_self") | ||
# Return the figure and path to the saved heatmap | ||
return self_corr_heatmap, self_corr_save_path | ||
|
||
else: | ||
# Return the figure without saving | ||
return self_corr_heatmap | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Fix docstring parameter references and return type annotations.
Issues in both plotSelfCorrHeatmap
and plotCrossCorrHeatmap
:
- Docstring references
save_path
but parameter issave_dir_path
- Return type annotation uses
|
instead of,
in tuple
Apply these fixes:
- Update parameter references in docstrings
- Fix return type annotations:
--> tuple[Figure | Figure, Path]:
+--> tuple[Figure, Path]:
Also applies to: 377-434
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 363-363: src/readii/analyze/plot_correlation.py#L363
Added line #L363 was not covered by tests
[warning] 369-369: src/readii/analyze/plot_correlation.py#L369
Added line #L369 was not covered by tests
|
||
Returns | ||
------- | ||
self_corr_hist : plt.Figure | ||
Figure object containing the histogram of the self correlations. | ||
if save_dir_path is not None: | ||
self_corr_save_path : Path | ||
Path to the saved histogram. | ||
""" | ||
# Get the self correlations for the specified feature type | ||
self_corr = getSelfCorrelations(correlation_matrix, feature_type_name) | ||
|
||
# Make the histogram figure | ||
self_corr_hist, _, _ = plotCorrelationHistogram(self_corr, | ||
num_bins=num_bins, | ||
xlabel = f"{correlation_method.capitalize()} Self Correlations", | ||
y_lower_bound = y_lower_bound, | ||
y_upper_bound = y_upper_bound, | ||
title = f"Distribution of {correlation_method.capitalize()} Self Correlations", | ||
subtitle = f"{feature_type_name}") | ||
|
||
if save_dir_path is not None: | ||
# Save the histogram to a png file | ||
self_corr_save_path = saveCorrelationHistogram(self_corr_hist, | ||
save_dir_path, | ||
feature_types=[feature_type_name], | ||
correlation_type=f"{correlation_method}_self") | ||
|
||
# Return the figure and the path to the saved histogram | ||
return self_corr_hist, self_corr_save_path | ||
|
||
else: | ||
# Return the histogram figure | ||
return self_corr_hist | ||
|
||
|
||
|
||
def plotCrossCorrHistogram(correlation_matrix:pd.DataFrame, | ||
vertical_feature_name:str, | ||
horizontal_feature_name:str, | ||
correlation_method:str = "pearson", | ||
num_bins:int = 100, | ||
y_lower_bound:int = 0, | ||
y_upper_bound:int = None, | ||
save_dir_path:Optional[str] = None) -> tuple[Figure | Figure, Path]: | ||
"""Plot a histogram of the cross correlations from a correlation matrix. | ||
|
||
Parameters | ||
---------- | ||
correlation_matrix : pd.DataFrame | ||
Dataframe containing the correlation matrix to plot. | ||
vertical_feature_name : str | ||
Name of the vertical feature type to get self correlations for. Must be the suffix of some feature names in the correlation matrix index. | ||
horizontal_feature_name : str | ||
Name of the horizontal feature type to get self correlations for. Must be the suffix of some feature names in the correlation matrix columns. | ||
correlation_method : str, optional | ||
Method to use for calculating correlations. Default is "pearson". | ||
num_bins : int, optional | ||
Number of bins to use for the histogram generation. The default is 100. | ||
y_lower_bound : int, optional | ||
Lower bound for the y-axis of the histogram. The default is 0. | ||
y_upper_bound : int, optional | ||
Upper bound for the y-axis of the histogram. The default is None. | ||
save_dir_path : str, optional | ||
Path to save the histogram to. If None, the histogram will not be saved. Default is None. | ||
File will be saved to {save_dir_path}/histogram/{vertical_feature_name}_vs_{horizontal_feature_name}_{correlation_method}_cross_correlation_histogram.png | ||
|
||
Returns | ||
------- | ||
cross_corr_hist : plt.Figure | ||
Figure object containing the histogram of the cross correlations. | ||
if save_dir_path is not None: | ||
cross_corr_save_path : Path | ||
Path to the saved histogram. | ||
""" | ||
# Get the cross correlations for the specified feature type | ||
cross_corr = getCrossCorrelations(correlation_matrix, vertical_feature_name, horizontal_feature_name) | ||
|
||
# Make the histogram figure | ||
cross_corr_hist, _, _ = plotCorrelationHistogram(cross_corr, | ||
num_bins=num_bins, | ||
xlabel = f"{correlation_method.capitalize()} Correlation", | ||
y_lower_bound = y_lower_bound, | ||
y_upper_bound = y_upper_bound, | ||
title = f"Distribution of {correlation_method.capitalize()} Cross Correlations", | ||
subtitle=f"{vertical_feature_name} vs {horizontal_feature_name}") | ||
|
||
if save_dir_path is not None: | ||
# Save the histogram to a png file | ||
cross_corr_save_path = saveCorrelationHistogram(cross_corr_hist, | ||
save_dir_path, | ||
feature_types=[vertical_feature_name, horizontal_feature_name], | ||
correlation_type=f"{correlation_method}_cross") | ||
|
||
# Return the figure and the path to the saved histogram | ||
return cross_corr_hist, cross_corr_save_path | ||
|
||
else: | ||
# Return the histogram figure | ||
return cross_corr_hist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add input validation and improve error handling in histogram functions.
Both plotSelfCorrHistogram
and plotCrossCorrHistogram
lack input validation:
- No validation for negative
num_bins
- No validation for
y_lower_bound
>y_upper_bound
- No validation for feature name existence
Add input validation:
def validate_histogram_params(num_bins: int, y_lower_bound: int, y_upper_bound: Optional[int]) -> None:
if num_bins <= 0:
raise ValueError("num_bins must be positive")
if y_upper_bound is not None and y_lower_bound > y_upper_bound:
raise ValueError("y_lower_bound cannot be greater than y_upper_bound")
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 489-489: src/readii/analyze/plot_correlation.py#L489
Added line #L489 was not covered by tests
[warning] 495-495: src/readii/analyze/plot_correlation.py#L495
Added line #L495 was not covered by tests
[warning] 555-555: src/readii/analyze/plot_correlation.py#L555
Added line #L555 was not covered by tests
[warning] 561-561: src/readii/analyze/plot_correlation.py#L561
Added line #L561 was not covered by tests
add self and cross correlation plot functions
Summary by CodeRabbit