feat: Report processor supports cumulative and total measurements with different sets of EDP combinations. #1885

@ple13 ple13 commented Oct 29, 2024

No description provided.

@wfa-reviewable

This change is Reviewable

@SanjayVas SanjayVas removed their request for review October 29, 2024 18:51
@SanjayVas
Member

Removing myself as reviewer since the vast majority of this is in the Python processing code.

Member

@kungfucraig kungfucraig left a comment

Reviewed 2 of 6 files at r1, 1 of 3 files at r2, 2 of 3 files at r3, all commit messages.
Reviewable status: 5 of 6 files reviewed, 11 unresolved discussions (waiting on @ple13)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 39 at r3 (raw file):

def get_cover_relationships(edp_combinations: list[FrozenSet[str]]):

I would tend to decompose this a bit and provide the following functions:

isCover(s, possible_cover):

getCovers(s, other_sets):

If you have these it will be much easier to read and test.
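
A minimal sketch of that decomposition, assuming a cover of a set is a collection of its proper subsets whose union equals the set. The snake_case names, the exact signatures, and the brute-force enumeration are illustrative assumptions, not the code under review:

```python
from itertools import combinations
from typing import FrozenSet, Iterable


def is_cover(target_set: FrozenSet[str],
             possible_cover: Iterable[FrozenSet[str]]) -> bool:
  """Returns True if the union of possible_cover equals target_set."""
  return frozenset().union(*possible_cover) == target_set


def get_covers(
    target_set: FrozenSet[str], other_sets: list[FrozenSet[str]]
) -> list[tuple[FrozenSet[str], ...]]:
  """Returns every combination of proper subsets whose union is target_set."""
  # Assumption: only proper subsets of the target may take part in a cover.
  candidates = [s for s in other_sets if s < target_set]
  covers = []
  # A single proper subset can never cover the target, so start at size 2.
  for size in range(2, len(candidates) + 1):
    for combo in combinations(candidates, size):
      if is_cover(target_set, combo):
        covers.append(combo)
  return covers
```

With the predicate split out, each piece can be unit tested on tiny inputs without constructing a report. The enumeration is exponential in the number of candidate subsets, which is only reasonable while the number of EDP combinations stays small.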


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 77 at r3 (raw file):

  __reach_time_series_by_edp_combination: dict[
    FrozenSet[str], list[Measurement]]
  __reach_whole_campaign_by_edp_combination: dict[FrozenSet[str], Measurement]

You can drop these. It's enough to assign to them in the init function; however, you should document them in the docstring of the class.

Also, prefer single underscores for members.
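
A rough sketch of what this could look like. The class name `ExampleReport` is hypothetical, and `float` stands in for the `Measurement` type, which is not shown in this thread:

```python
from typing import FrozenSet


class ExampleReport:
  """Holds reach measurements keyed by EDP combination.

  Attributes:
    _reach_time_series_by_edp_combination: Maps each EDP combination to its
      list of cumulative reach measurements.
    _reach_whole_campaign_by_edp_combination: Maps each EDP combination to
      its whole-campaign reach measurement.
  """

  def __init__(
      self,
      reach_time_series_by_edp_combination: dict[FrozenSet[str], list[float]],
      reach_whole_campaign_by_edp_combination: dict[FrozenSet[str], float],
  ):
    # Assigning in __init__ is enough; no class-level declarations are needed.
    self._reach_time_series_by_edp_combination = (
        reach_time_series_by_edp_combination)
    self._reach_whole_campaign_by_edp_combination = (
        reach_whole_campaign_by_edp_combination)
```

A single leading underscore marks the member as internal by convention. A double leading underscore triggers name mangling (the attribute is stored as `_ExampleReport__name`), which makes the members harder to reach from tests and subclasses.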


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 84 at r3 (raw file):

        FrozenSet[str], list[Measurement]],
      reach_whole_campaign_by_edp_combination: dict[
        FrozenSet[str], Measurement] = None,

Do you really need to default this? Won't your client always pass it?


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 99 at r3 (raw file):

    )

    if reach_whole_campaign_by_edp_combination is None:

What you're doing is fine, especially for such a long variable name, but writing the following is better:

member = {} if input_var is None else input_var

Shortening the member variable names wouldn't hurt, but in a separate PR, if you want to.
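
For illustration, here is the conditional-expression form in context. The class and parameter names are hypothetical, and `float` stands in for `Measurement`:

```python
from typing import FrozenSet, Optional


class ExampleReport:

  def __init__(
      self,
      whole_campaign: Optional[dict[FrozenSet[str], float]] = None,
  ):
    # A fresh dict is created per instance when nothing is passed in.
    # A literal `= {}` default would be shared across all calls instead.
    self._whole_campaign = {} if whole_campaign is None else whole_campaign
```

Testing `is None` rather than writing `whole_campaign or {}` keeps a caller-supplied empty dict as the same object, so later updates by the caller remain visible.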


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 130 at r3 (raw file):

    return self.__reach_whole_campaign_by_edp_combination[edp_combination]

  def get_cumulative_edp_combinations(self):

These aren't returning/operating on combinations. They seem to just return a set of the keys and count the number of keys of the various dicts.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 146 at r3 (raw file):

  def get_cumulative_subset_relationships(self):
    edp_combinations = list(self.__reach_time_series_by_edp_combination)

No need for the temporary.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 150 at r3 (raw file):

  def get_whole_campaign_subset_relationships(self):
    edp_combinations = list(self.__reach_whole_campaign_by_edp_combination)

Same.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 154 at r3 (raw file):

  def get_cumulative_cover_relationships(self):
    edp_combinations = list(self.__reach_time_series_by_edp_combination)

Same.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 158 at r3 (raw file):

  def get_whole_campaign_cover_relationships(self):
    edp_combinations = list(self.__reach_whole_campaign_by_edp_combination)

Same.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 345 at r3 (raw file):

        )

  def __add_subset_relations_to_spec(self, spec):

It sure would be nice to have tests for each of these functions.


src/main/python/wfa/measurement/reporting/postprocessing/tools/post_process_origin_report.py line 234 at r3 (raw file):

def buildCorrectedExcel(correctedReport, excel):

Do we still need the excel functions?

Contributor Author

@ple13 ple13 left a comment

Reviewable status: 1 of 7 files reviewed, 11 unresolved discussions (waiting on @kungfucraig)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 39 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

I would tend to decompose this a bit and provide the following functions:

isCover(s, possible_cover):

getCovers(s, other_sets):

If you have these it will be much easier to read and test.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 77 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

You can drop these. It's enough to assign to them in the init function; however, you should document them in the docstring of the class.

Also, prefer single underscores for members.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 84 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Do you really need to default this? Won't your client always pass it?

This doesn't need a default value. I had it so that I didn't need to update previous unit tests. Anyway, I've updated the code and removed the default value.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 99 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

What you're doing is fine, especially for such a long variable name, but writing the following is better:

member = {} if input_var is None else input_var

Shortening the member variable names wouldn't hurt, but in a separate PR, if you want to.

Thanks. I've updated it.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 130 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

These aren't returning/operating on combinations. They seem to just return a set of the keys and count the number of keys of the various dicts.

The keys of the dictionaries here are the edp_combinations. e.g. self._reach_whole_campaign_by_edp_combination is a dictionary that maps edp_combination to reach measurements.
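
To make the shape of the data concrete, here is a small sketch with made-up values. The EDP names are hypothetical and `float` stands in for the `Measurement` type:

```python
from typing import FrozenSet

# Hypothetical data: each key is an EDP combination, each value its reach.
reach_whole_campaign_by_edp_combination: dict[FrozenSet[str], float] = {
    frozenset({"edp1"}): 100.0,
    frozenset({"edp2"}): 80.0,
    frozenset({"edp1", "edp2"}): 150.0,
}

# The keys of the dict are the EDP combinations themselves.
edp_combinations = set(reach_whole_campaign_by_edp_combination.keys())
number_of_edp_combinations = len(reach_whole_campaign_by_edp_combination)
```

Because the keys are frozensets, `frozenset({"edp2", "edp1"})` and `frozenset({"edp1", "edp2"})` refer to the same entry regardless of element order.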


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 146 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

No need for the temporary.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 150 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Same.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 154 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Same.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 158 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Same.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 345 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

It sure would be nice to have tests for each of these functions.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/tools/post_process_origin_report.py line 234 at r3 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Do we still need the excel functions?

We don't need them any more. I've removed them.

@ple13 ple13 requested a review from stevenwarejones November 4, 2024 18:18
Member

@kungfucraig kungfucraig left a comment

Reviewed 6 of 6 files at r4, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @ple13 and @stevenwarejones)


src/main/python/wfa/measurement/reporting/postprocessing/noiseninja/solver.py line 177 at r4 (raw file):

                          problem=self._problem())
    else:
      while attempt_count < 10:

Introduce a file level constant for this.
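
A sketch of the pattern, not the solver's actual code. The constant name `MAX_ATTEMPTS`, the helper `solve_with_retries`, and the convention that a failed attempt returns `None` are all assumptions made for illustration:

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# File-level constant replacing the literal 10 in the loop condition.
MAX_ATTEMPTS = 10


def solve_with_retries(try_solve: Callable[[], Optional[T]]) -> tuple[T, int]:
  """Calls try_solve until it returns a non-None result or attempts run out."""
  attempt_count = 0
  while attempt_count < MAX_ATTEMPTS:
    attempt_count += 1
    solution = try_solve()
    if solution is not None:
      return solution, attempt_count
  raise RuntimeError(f"No solution after {MAX_ATTEMPTS} attempts")
```

Naming the limit documents its purpose and gives tests and error messages a single value to refer to.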


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 53 at r4 (raw file):

      lambda x, y: x.union(y), possible_cover
  )
  if union_of_possible_cover == target_set:

return union_of_possible_cover == target_set

Contributor Author

@ple13 ple13 left a comment

Reviewable status: 5 of 7 files reviewed, 3 unresolved discussions (waiting on @kungfucraig and @stevenwarejones)


src/main/python/wfa/measurement/reporting/postprocessing/noiseninja/solver.py line 177 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Introduce a file level constant for this.

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 53 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

return union_of_possible_cover == target_set

Done.

Collaborator

@stevenwarejones stevenwarejones left a comment

What do you think about using Pyre for type checking?

Reviewed 1 of 6 files at r1, 1 of 6 files at r4, 1 of 2 files at r5, all commit messages.
Reviewable status: 6 of 7 files reviewed, 3 unresolved discussions (waiting on @kungfucraig)

Member

@SanjayVas SanjayVas left a comment

Rieman had done an analysis on how to add Python infrastructure to Halo repos back when Planning was still a priority. IIRC, the main issue was the lack of Bazel support for all of the common type checkers.

FWIW internally at Google we use PyType

Reviewed 1 of 6 files at r1.
Reviewable status: 6 of 7 files reviewed, 3 unresolved discussions (waiting on @kungfucraig)

Contributor Author

@ple13 ple13 left a comment

I've created an issue for the type checking: #1912.

Reviewable status: 6 of 7 files reviewed, 3 unresolved discussions (waiting on @kungfucraig)

Collaborator

@stevenwarejones stevenwarejones left a comment

Reviewed 1 of 2 files at r5.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @kungfucraig and @ple13)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 26 at r5 (raw file):

def get_subset_relationships(edp_combinations: list[FrozenSet[str]]):

I'd prefer all return types be explicitly stated. You can probably use MonkeyType to generate a lot of these for you:
https://github.com/Instagram/MonkeyType
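
As an illustration of an explicit return type, here is one possible annotated form. Only the function name and parameter type come from the line under review; the function body, the return type, and the (superset, subset) ordering of each pair are assumptions:

```python
from typing import FrozenSet


def get_subset_relationships(
    edp_combinations: list[FrozenSet[str]],
) -> list[tuple[FrozenSet[str], FrozenSet[str]]]:
  """Returns (superset, subset) pairs for every proper-subset relation."""
  return [
      (parent, child)
      for parent in edp_combinations
      for child in edp_combinations
      if child < parent  # `<` on frozensets tests for a proper subset.
  ]
```

MonkeyType collects types from actual calls (`monkeytype run`) and can write them into the source (`monkeytype apply`), so annotations generated that way reflect only the code paths that were exercised and are worth reviewing by hand.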

Collaborator

@stevenwarejones stevenwarejones left a comment

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @kungfucraig and @ple13)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 55 at r5 (raw file):

  return union_of_possible_cover == target_set

def get_covers(target_set, other_sets):

can you explicitly type all the input and output params of functions?

Member

@kungfucraig kungfucraig left a comment

Reviewed 1 of 2 files at r5.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @ple13)

Member

@kungfucraig kungfucraig left a comment

Reviewed 5 of 5 files at r6, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ple13)

Contributor Author

@ple13 ple13 left a comment

Reviewable status: 6 of 8 files reviewed, 2 unresolved discussions (waiting on @kungfucraig and @stevenwarejones)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 26 at r5 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

I'd prefer all return types be explicitly stated. You can probably use MonkeyType to generate a lot of these for you:
https://github.com/Instagram/MonkeyType

Done.


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 55 at r5 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

can you explicitly type all the input and output params of functions?

Done.

Collaborator

@stevenwarejones stevenwarejones left a comment

Reviewed 1 of 6 files at r4, 2 of 5 files at r6, 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ple13)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 222 at r7 (raw file):

                                   between metrics. Each key is a parent metric,
                                   and the value is a list of its child metrics.
        _cumulative_inconsistency_allowed_edp_combinations: A set of EDP

What's the use case for allowing inconsistencies with only some EDPs?


src/test/python/wfa/measurement/reporting/postprocessing/report/test_report.py line 1287 at r7 (raw file):

    )

    corrected = report.get_corrected_report()

Can you add a comment to each of these corrections explaining what you are checking was properly corrected?

Collaborator

@stevenwarejones stevenwarejones left a comment

Reviewed 1 of 6 files at r4, 1 of 5 files at r6.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ple13)

Collaborator

@stevenwarejones stevenwarejones left a comment

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ple13)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 222 at r7 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

What's the use case for allowing inconsistencies with only some EDPs?

please add a comment that this is for TV

Contributor Author

@ple13 ple13 left a comment

Reviewable status: 6 of 8 files reviewed, all discussions resolved (waiting on @kungfucraig and @stevenwarejones)


src/main/python/wfa/measurement/reporting/postprocessing/report/report.py line 222 at r7 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

please add a comment that this is for TV

Done.


src/test/python/wfa/measurement/reporting/postprocessing/report/test_report.py line 1287 at r7 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

Can you add a comment to each of these corrections explaining what you are checking was properly corrected?

Done.

Contributor Author

@ple13 ple13 left a comment

Reviewed 1 of 6 files at r1, 1 of 6 files at r4, 3 of 5 files at r6, 1 of 2 files at r7, 2 of 2 files at r8, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @ple13)

@ple13 ple13 merged commit 4e0ca03 into main Dec 2, 2024
4 checks passed
@ple13 ple13 deleted the lephi-separate-cumulative-measurements-from-whole-campaign-measurements branch December 2, 2024 19:45