Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace obsolete terminology TODO: fix go.mod for tapir repo #81

Merged
merged 1 commit into from
Jan 15, 2025

Conversation

zluudg
Copy link
Contributor

@zluudg zluudg commented Jan 14, 2025

Fix #79

Summary by CodeRabbit

Release Notes

  • Terminology Update

    • Replaced "whitelist" with "allowlist"
    • Replaced "blacklist" with "denylist"
    • Replaced "greylist" with "doubtlist"
  • Configuration Changes

    • Updated policy, sources, and output configuration files
    • Renamed list types and actions across multiple configuration files
  • Code Refactoring

    • Updated method and variable names to reflect new terminology
    • Maintained existing logic and functionality
    • Improved consistency in naming conventions
  • Documentation

    • Updated README with new terminology
    • Revised comments and logging messages

This release focuses on updating terminology to be more inclusive and clear, without changing the core functionality of the system.

@zluudg zluudg requested a review from a team as a code owner January 14, 2025 14:59
Copy link

coderabbitai bot commented Jan 14, 2025

Walkthrough

This comprehensive pull request focuses on modernizing the terminology used for domain lists across the entire TAPIR-POP project. The changes systematically replace outdated terms like "whitelist", "blacklist", and "greylist" with more inclusive alternatives: "allowlist", "denylist", and "doubtlist". This renaming effort spans multiple files, including configuration files, source code, and documentation, ensuring consistent terminology throughout the project.

Changes

File Change Summary
README.md Terminology updated from whitelist/blacklist/greylist to allowlist/denylist/doubtlist
apihandler.go API commands and responses updated with new list terminology
bootstrap.go Method and variable names updated to use doubtlist
config.go Configuration structures renamed to reflect new list types
lists.go Methods renamed to match new terminology (Whitelisted → Allowlisted, etc.)
policy.go Action and method names updated with new list terminology
Sample YAML files List types and source names updated to new terminology
refreshengine.go Domain management terminology updated to new standards
rpz.go Logic for RPZ data generation updated to reflect new list names
sources.go List management terminology updated across various sources
structs.go Fields and types renamed to align with new terminology
go.mod Dependency version updated for tapir module

Assessment against linked issues

Objective Addressed Explanation
Modernize list terminology
Replace "Blacklist" with "Denylist"
Replace "Whitelist" with "Allowlist"
Replace "Greylist" with "Doubtlist"

Poem

🐰 A Rabbit's Renaming Rhyme 🐰

From black and white to shades of doubt,
Our lists have shed their old-school route
Allowlist, denylist, with care we write
Inclusive terms now shine so bright!

Hop hop hooray for language new! 🌈

Finishing Touches

  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (9)
policy.go (1)

106-108: Fix typos in function comments.

The comments contain typos that should be fixed.

-// Note: we onlygethere when we know that this name is only doubtlisted
-// so no need tocheckfor allow- or denylisting
+// Note: we only get here when we know that this name is only doubtlisted
+// so no need to check for allow- or denylisting
rpz.go (1)

33-34: Consider adding debug logging for list sizes.

While the logging is informative, consider adding the size of the lists being processed to help with monitoring and debugging.

-	pd.Logger.Printf("---> GenerateRpzAxfr: working on denylist %s (%d names)",
-		bname, len(blist.Names))
+	pd.Logger.Printf("---> GenerateRpzAxfr: working on denylist %s (%d names, %d total denylisted)",
+		bname, len(blist.Names), len(deny))

Also applies to: 57-58

sources.go (2)

71-82: Add validation for policy configuration.

The policy configuration lacks validation for the action strings. Consider adding validation to ensure only valid actions are accepted.

+func validatePolicyAction(action string) error {
+    validActions := map[string]bool{
+        "allow": true,
+        "deny": true,
+        "doubt": true,
+    }
+    if !validActions[action] {
+        return fmt.Errorf("invalid policy action: %s", action)
+    }
+    return nil
+}

384-388: Update comment formatting for better readability.

The multi-line comment about special cases could be formatted better for readability.

-// Parse the CNAME (in the shape of a dns.RR) that is found in the RPZ and sort the data into the
-// appropriate list in PopData. Note that there are two special cases:
-//  1. If a "allowlist" RPZ source has a rule with an action other than "rpz-passthru." then that rule doesn't
-//     really belong in a "allowlist" source. So we take that rule an put it in the doubt_catchall bucket instead.
-//  2. If a "{doubt|deny}list" RPZ source has a rule with an "rpz-passthru." (i.e. allowlist) action then that
-//     rule doesn't really belong in a "{doubt|deny}list" source. So we take that rule an put it in the
-//     allow_catchall bucket instead.
+// Parse the CNAME (dns.RR) found in the RPZ and sort into appropriate PopData list.
+// Special cases:
+// 1. Allowlist source with non-passthru action:
+//    - Rule doesn't belong in allowlist
+//    - Move to doubt_catchall bucket
+//
+// 2. Doubt/Denylist source with passthru action:
+//    - Rule doesn't belong in doubt/denylist
+//    - Move to allow_catchall bucket
refreshengine.go (2)

348-350: Consider consolidating duplicate error handling.

The error handling for allowlist and denylist checks follows the same pattern. Consider extracting to a helper function.

+func (pd *PopData) checkListStatus(domain string) (string, bool) {
+    if pd.Allowlisted(domain) {
+        return fmt.Sprintf("Domain name \"%s\" is allowlisted. No change.", domain), true
+    }
+    if pd.Denylisted(domain) {
+        return fmt.Sprintf("Domain name \"%s\" is already denylisted. No change.", domain), true
+    }
+    return "", false
+}

Also applies to: 356-358


365-372: Add logging for doubtlist operations.

The doubtlist operations lack detailed logging that could help with debugging.

 if cmd.ListType == "doubtlist" {
+    pd.Logger.Printf("Processing doubtlist operation for domain %s with policy %s", cmd.Domain, cmd.Policy)
     _, err := pd.DoubtlistAdd(cmd.Domain, cmd.Policy, cmd.RpzSource)
     if err != nil {
+        pd.Logger.Printf("Failed to add domain %s to doubtlist: %v", cmd.Domain, err)
         resp.Error = true
         resp.ErrorMsg = fmt.Sprintf("Error adding domain name \"%s\" to doubtlisting DB: %v", cmd.Domain, err)
     } else {
+        pd.Logger.Printf("Successfully added domain %s to doubtlist", cmd.Domain)
         resp.Msg = fmt.Sprintf("Domain name \"%s\" (policy %s) added to doubtlisting DB.",
             cmd.Domain, cmd.Policy)
     }
apihandler.go (1)

445-446: Update case name for clarity.

The case name change from "colourlists" to "filterlists" is good, but consider using a more descriptive name.

-case "filterlists":
+case "list-filters":
pop-sources.sample.yaml (1)

56-56: Fix YAML formatting issues.

Remove trailing spaces and ensure consistent indentation using spaces instead of tabs.

Also applies to: 61-61

🧰 Tools
🪛 yamllint (1.35.1)

[error] 56-56: trailing spaces

(trailing-spaces)

README.md (1)

20-20: Fix minor grammatical issues in the documentation.

  1. Line 20: Use "e.g." instead of "eg."
  2. Line 53: Consider rephrasing "is able to" to "can"
  3. Line 72: Fix subject-verb agreement: "name that appears" instead of "name that appear"

Also applies to: 53-53, 72-72

🧰 Tools
🪛 LanguageTool

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 873a839 and c5eabf4.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • README.md (4 hunks)
  • apihandler.go (5 hunks)
  • bootstrap.go (5 hunks)
  • config.go (2 hunks)
  • configupdater.go (1 hunks)
  • go.mod (1 hunks)
  • lists.go (4 hunks)
  • mqtt.go (1 hunks)
  • policy.go (3 hunks)
  • pop-outputs.sample.yaml (1 hunks)
  • pop-policy.sample.yaml (1 hunks)
  • pop-sources.sample.yaml (1 hunks)
  • reaper.go (1 hunks)
  • refreshengine.go (2 hunks)
  • rpz.go (6 hunks)
  • sources.go (8 hunks)
  • structs.go (2 hunks)
🧰 Additional context used
🪛 yamllint (1.35.1)
pop-policy.sample.yaml

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)

pop-sources.sample.yaml

[error] 4-4: syntax error: found character '\t' that cannot start any token

(syntax)


[error] 56-56: trailing spaces

(trailing-spaces)


[error] 61-61: trailing spaces

(trailing-spaces)

🪛 LanguageTool
README.md

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)


[style] ~53-~53: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...remely compact data structures. TEM is able to mmap very large lists in DAWG format wh...

(BE_ABLE_TO)


[style] ~53-~53: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...ata structures. TEM is able to mmap very large lists in DAWG format which is used for ...

(EN_WEAK_ADJECTIVE)


[uncategorized] ~53-~53: Possible missing comma found.
Context: ...s able to mmap very large lists in DAWG format which is used for large allowlists. -...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~72-~72: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...ction. - the same doubtlisted name that appear in N distinct intelligence feeds is i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

🔇 Additional comments (25)
structs.go (3)

74-76: LGTM! Action field renaming is consistent.

The renaming of action fields in PopPolicy struct follows the same pattern, maintaining consistency across the codebase.


79-85: LGTM! Type and field renaming is consistent.

The renaming of the policy type and its fields maintains the same structure while using the updated terminology.

Let's verify if there are any missed references to the old type name:

✅ Verification successful

Renaming verification complete - all references updated

The comprehensive search across the codebase confirms that all references to "GreylistPolicy" and related terms have been successfully updated to the new "DoubtlistPolicy" naming convention.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining occurrences of the old type name
rg -i "greylistpolicy"

Length of output: 22


Script:

#!/bin/bash
# Case-sensitive search for GreylistPolicy and related terms
rg -g '!*.git' 'GreylistPolicy'
rg -g '!*.git' '[Gg]rey[Ll]ist'

# Search for struct initialization patterns that might use the old name
ast-grep --pattern 'GreylistPolicy{$$$}'

# Search for any documentation or comments containing the term
rg -g '!*.git' -i 'grey.?list' -t md
rg -g '!*.git' -i 'grey.?list' -g '*.go' -g '*.txt' -g '*.doc*'

Length of output: 213


28-29: LGTM! Field renaming is consistent.

The renaming of fields from BlacklistedNames to DenylistedNames and GreylistedNames to DoubtlistedNames aligns with the PR objectives.

Let's verify if there are any missed references to these fields:

✅ Verification successful

✓ Terminology change verified: No remaining instances of old terminology

The codebase has been thoroughly updated with the new terminology. All references to the old terms have been properly replaced with deny and doubt variants consistently across all files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining occurrences of old terminology
rg -i "blacklistednames|greylistednames" 

Length of output: 42


Script:

#!/bin/bash
# Search with broader patterns including variations
echo "=== Searching for old terminology ==="
rg -i "black.?list|grey.?list"

echo -e "\n=== Searching for new terminology ==="
rg -i "deny.?list|doubt.?list"

echo -e "\n=== Searching specifically in structs.go ==="
cat structs.go

Length of output: 16617

lists.go (3)

Line range hint 15-34: LGTM! Function and message renaming is consistent.

The renaming of the Allowlisted function (from Whitelisted) and its associated messages maintains consistency with the new terminology.


Line range hint 37-52: LGTM! Function and message renaming is consistent.

The renaming of the Denylisted function (from Blacklisted) and its associated messages follows the same pattern.


Line range hint 56-71: LGTM! Function and message renaming is consistent.

The renaming of the Doubtlisted function and related functions (from Greylisted) maintains consistency. All associated messages and error strings have been updated accordingly.

Let's verify if any old terminology remains in error messages or comments:

Also applies to: 75-87, 91-94

✅ Verification successful

Complete and consistent terminology update confirmed

The codebase shows thorough and consistent use of the new "doubtlist" terminology across all functions, variables, comments, error messages, and configuration keys. No instances of the old terminology ("greylist" or variations) were found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining occurrences of old terminology in strings
rg -i "whitelist|blacklist|greylist" --type go

Length of output: 48


Script:

#!/bin/bash
# Verify command works by searching for known term
echo "=== Searching for current terminology ==="
rg -i "doubtlist" --type go

echo -e "\n=== Searching for old terminology with variations ==="
rg -i "grey|gray|white|black" --type go

echo -e "\n=== Checking file names for old terminology ==="
fd -e go . | grep -i "grey\|gray\|white\|black"

Length of output: 10092

reaper.go (1)

24-24: Consider the order of list types.

While the renaming is consistent, consider if the order of list types (allowlist, doubtlist, denylist) affects the processing logic. The original order might have been intentional for policy application.

Let's check if the order of list processing is significant:

✅ Verification successful

List type order does not affect processing logic

The codebase processes each list type independently through dedicated functions. The order in the string slice is used only for configuration iteration and does not impact the policy application or decision-making logic.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any code that might depend on list processing order
rg -B 5 -A 5 'for.*range.*\["(whitelist|blacklist|greylist|allowlist|denylist|doubtlist)"\]'

Length of output: 5813

configupdater.go (1)

80-81: ⚠️ Potential issue

Verify the impact of removing multi-list iteration.

The change from iterating over all list types to only processing doubtlist appears to be a significant functional change, not just a terminology update. This might affect how configuration updates are processed for other list types.

Let's verify the impact:

Consider restoring the multi-list iteration with updated terminology:

-//for _, listtype := range []string{"allowlist", "denylist", "doubtlist"} {
-for _, wbgl := range pd.Lists["doubtlist"] {
+for _, listtype := range []string{"allowlist", "denylist", "doubtlist"} {
+    for _, wbgl := range pd.Lists[listtype] {
config.go (2)

106-112: LGTM! Consistent terminology update in PolicyConf.

The field names have been appropriately updated to use more inclusive terminology:

  • Allowlist replaces Whitelist
  • Denylist replaces Blacklist
  • Doubtlist replaces Greylist

Line range hint 118-131: LGTM! DoubtlistConf type definition is consistent.

The new type maintains the same structure while using updated terminology:

  • DenyTapir replaces BlackTapir
bootstrap.go (2)

73-73: LGTM! API commands and error messages consistently updated.

Commands and error messages have been updated to use the new terminology:

  • "doubtlist-status" replaces "greylist-status"
  • Error messages now reference "doubtlist" instead of "greylist"

Also applies to: 91-91, 95-95


104-104: LGTM! Variable names and logging messages consistently updated.

All references have been updated to use "doubtlist":

  • Command string "export-doubtlist"
  • Variable declaration and usage
  • Logging messages

Also applies to: 119-119, 121-121, 123-123, 142-142, 143-143, 145-145, 152-152

mqtt.go (1)

77-77: LGTM! Switch statement cases updated with new terminology.

The case labels have been appropriately updated to:

  • "allowlist" (from "whitelist")
  • "denylist" (from "blacklist")
  • "doubtlist" (from "greylist")
policy.go (3)

24-24: LGTM! Type field documentation updated.

Documentation updated to reference "doubtlist" as the usual listtype.


110-145: LGTM! ComputeRpzDoubtlistAction implementation consistently updated.

All references have been updated to use the new terminology:

  • Variable names (doubtHits)
  • Error messages
  • Log messages
  • Policy references

170-186: LGTM! ComputeRpzAction implementation consistently updated.

The function has been updated to use the new terminology throughout:

  • Condition checks (Allowlisted, Denylisted, Doubtlisted)
  • Log messages
  • Action references (AllowlistAction, DenylistAction)
rpz.go (2)

30-31: LGTM! Map initialization looks good.

The map initialization with a capacity hint of 10000 is a good practice for performance optimization.


76-81: Verify the tag mask combination logic.

The tag mask combination logic needs careful verification to ensure it correctly merges the tag masks from different sources.

✅ Verification successful

Tag mask combination logic is correctly implemented

The bitwise OR operation is the appropriate method for combining tag masks, ensuring all flags from both sources are preserved. The implementation is properly guarded and consistent with standard bit mask operations.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for all tag mask operations to verify consistency
rg -A 3 "TagMask.*[|&^]"

Length of output: 183


Script:

#!/bin/bash
# Find TagMask type definition and related constants
ast-grep --pattern 'type $_ struct {
  $$$
  TagMask $_
  $$$
}'

# Find any constants or values related to TagMask
rg "TagMask.*=" -B 2 -A 2

# Get more context around the tag mask combination
rg -B 10 -A 10 "tmp.TagMask = doubt"

Length of output: 1532

sources.go (1)

51-53: LGTM! List initialization is consistent.

The initialization of allowlist, doubtlist, and denylist maps is consistent and well-structured.

apihandler.go (1)

472-473: LGTM! Response field updates are consistent.

The response field updates from GreylistedNames to DoubtlistedNames maintain consistency with the new terminology.

pop-outputs.sample.yaml (1)

12-12: LGTM! Terminology updates are consistent.

The changes from greylist to doubtlist for both MQTT and HTTP outputs are consistent with the new terminology.

Also applies to: 20-20

pop-policy.sample.yaml (1)

3-3: LGTM! Policy terminology updates are consistent.

The changes from whitelist/blacklist/greylist to allowlist/denylist/doubtlist in the policy configuration are consistent with the new terminology.

Also applies to: 7-11

pop-sources.sample.yaml (1)

4-4: LGTM! Source configuration updates are comprehensive and consistent.

The changes include:

  1. Renaming sources to remove color-based terminology
  2. Updating type fields to use the new terminology
  3. Updating descriptions to match the new terms

All changes are consistent with the project's new terminology standards.

Also applies to: 8-8, 16-16, 24-24, 31-32, 40-41, 46-48, 54-55, 62-62, 68-69

🧰 Tools
🪛 yamllint (1.35.1)

[error] 4-4: syntax error: found character '\t' that cannot start any token

(syntax)

go.mod (1)

6-6: Verify the tapir dependency update.

The tapir dependency has been updated to a newer version. This update likely contains the corresponding terminology changes in the tapir repo.

✅ Verification successful

Tapir dependency update verified

The update to github.com/dnstapir/tapir@18cc0cdb5c33 is confirmed to contain the terminology modernization changes from PR #22.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the tapir dependency version exists and check its changes
gh api repos/dnstapir/tapir/commits/18cc0cdb5c33 --jq '.commit.message'

Length of output: 164

README.md (1)

20-20: LGTM! Documentation updates reflect the new terminology.

The README has been updated consistently to use the new terminology throughout the document.

Also applies to: 22-22, 32-34

🧰 Tools
🪛 LanguageTool

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)

rpz.go Show resolved Hide resolved
apihandler.go Show resolved Hide resolved
pop-policy.sample.yaml Show resolved Hide resolved
@zluudg zluudg force-pushed the leon/pop-issue-79/modern-terminology branch from c5eabf4 to 33e5e0c Compare January 14, 2025 15:10
Copy link
Member

@oej oej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good code. Can't write a poem at this moment, so you'll just have be happy with the rabbit.

Copy link
Collaborator

@johanix johanix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
README.md (1)

20-20: Fix abbreviation format.

Change "eg." to "e.g." for correct abbreviation format.

-(eg. a domainname may be flagged by one source but allowlisted by another).
+(e.g., a domainname may be flagged by one source but allowlisted by another).
🧰 Tools
🪛 LanguageTool

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)

policy.go (1)

106-108: Fix comment typos and formatting.

The comment contains typos and spacing issues.

-// Note: we onlygethere when we know that this name is only doubtlisted
-// so no need tocheckfor allow- or denylisting
+// Note: we only get here when we know that this name is only doubtlisted
+// so no need to check for allow- or denylisting
rpz.go (1)

17-24: Consider enhancing algorithm documentation.

The comments could be more descriptive by including:

  • Expected outcomes for each step
  • Error handling scenarios
  • Performance considerations
sources.go (2)

71-106: Consider enhancing error handling structure.

While the error handling is functional, consider consolidating the error checks into a helper function to reduce repetition.

+func (pd *PopData) validatePolicyAction(action string, policyName string) (tapir.Action, error) {
+    result, err := tapir.StringToAction(action)
+    if err != nil {
+        return result, fmt.Errorf("error parsing %s policy: %v", policyName, err)
+    }
+    return result, nil
+}

384-388: Consider restructuring the comment block.

The multi-line comment could be more readable with bullet points:

-// Parse the CNAME (in the shape of a dns.RR) that is found in the RPZ and sort the data into the
-// appropriate list in PopData. Note that there are two special cases:
-//  1. If a "allowlist" RPZ source has a rule with an action other than "rpz-passthru." then that rule doesn't
-//     really belong in a "allowlist" source. So we take that rule an put it in the doubt_catchall bucket instead.
-//  2. If a "{doubt|deny}list" RPZ source has a rule with an "rpz-passthru." (i.e. allowlist) action then that
-//     rule doesn't really belong in a "{doubt|deny}list" source. So we take that rule an put it in the
-//     allow_catchall bucket instead.
+// RpzParseFuncFactory parses CNAME records from RPZ and sorts them into appropriate lists.
+//
+// Special cases:
+// • Allowlist sources:
+//   - Rules with non-passthru actions are moved to doubt_catchall
+// • Denylist/Doubtlist sources:
+//   - Rules with passthru actions are moved to allow_catchall
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5eabf4 and 15327a3.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (16)
  • README.md (4 hunks)
  • apihandler.go (5 hunks)
  • bootstrap.go (5 hunks)
  • config.go (2 hunks)
  • configupdater.go (1 hunks)
  • lists.go (4 hunks)
  • mqtt.go (1 hunks)
  • policy.go (3 hunks)
  • pop-outputs.sample.yaml (1 hunks)
  • pop-policy.sample.yaml (1 hunks)
  • pop-sources.sample.yaml (1 hunks)
  • reaper.go (1 hunks)
  • refreshengine.go (2 hunks)
  • rpz.go (6 hunks)
  • sources.go (8 hunks)
  • structs.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • mqtt.go
  • pop-outputs.sample.yaml
  • reaper.go
  • configupdater.go
  • config.go
  • structs.go
  • lists.go
  • apihandler.go
  • refreshengine.go
🧰 Additional context used
🪛 LanguageTool
README.md

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)


[style] ~53-~53: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...remely compact data structures. TEM is able to mmap very large lists in DAWG format wh...

(BE_ABLE_TO)


[style] ~53-~53: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...ata structures. TEM is able to mmap very large lists in DAWG format which is used for ...

(EN_WEAK_ADJECTIVE)

🪛 yamllint (1.35.1)
pop-policy.sample.yaml

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)

pop-sources.sample.yaml

[error] 4-4: syntax error: found character '\t' that cannot start any token

(syntax)


[error] 56-56: trailing spaces

(trailing-spaces)


[error] 61-61: trailing spaces

(trailing-spaces)

🔇 Additional comments (12)
pop-policy.sample.yaml (2)

3-18: LGTM! Terminology changes are consistent.

The policy configuration has been updated to use inclusive terminology:

  • "whitelist" → "allowlist"
  • "blacklist" → "denylist"
  • "greylist" → "doubtlist"
🧰 Tools
🪛 yamllint (1.35.1)

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)


7-10: ⚠️ Potential issue

Fix YAML syntax: Replace tabs with spaces.

The YAML file contains tabs which can cause parsing issues. Replace all tabs with spaces for consistent indentation.

-   allowlist:
-      action:		PASSTHRU
-   denylist:
-      action:		NODATA	# present in any denylist->action
+   allowlist:
+      action: PASSTHRU
+   denylist:
+      action: NODATA  # present in any denylist->action

Likely invalid or redundant comment.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)

pop-sources.sample.yaml (1)

Line range hint 4-74: LGTM! Source names and types are consistently updated.

All source configurations have been updated to use inclusive terminology:

  • Source types changed from whitelist/blacklist/greylist to allowlist/denylist/doubtlist
  • Source names updated (e.g., black_as_sin → sinful, white_and_shiny → shiny)
🧰 Tools
🪛 yamllint (1.35.1)

[error] 4-4: syntax error: found character '\t' that cannot start any token

(syntax)


[error] 56-56: trailing spaces

(trailing-spaces)


[error] 61-61: trailing spaces

(trailing-spaces)

README.md (1)

32-34: LGTM! Documentation accurately reflects new terminology.

The documentation has been thoroughly updated to use inclusive terminology throughout:

  • List types: allowlist, denylist, doubtlist
  • Policy descriptions and examples

Also applies to: 67-74

bootstrap.go (1)

73-73: LGTM! Bootstrap process updated with consistent terminology.

All references to "greylist" have been replaced with "doubtlist" throughout the bootstrap process:

  • API commands
  • Variable names
  • Error messages and logging

Also applies to: 91-91, 95-95, 104-104, 119-123, 142-145, 152-152

policy.go (1)

24-24: LGTM! Policy logic updated with consistent terminology.

The policy implementation has been thoroughly updated to use inclusive terminology:

  • Type definitions
  • Function names (e.g., ComputeRpzDoubtlistAction)
  • Variable names and logging
  • Policy actions and checks

Also applies to: 106-186

rpz.go (3)

30-31: LGTM! Clean terminology update.

The variable renaming is consistent and the map initializations use appropriate capacity hints.

Also applies to: 33-34, 54-55, 57-58


138-138: Previous feedback was acknowledged.

This line was previously flagged but the current implementation was confirmed as intended.


200-201: LGTM! Consistent terminology in conditions and logging.

The ALLOWLIST action checks and logging messages are consistently updated.

Also applies to: 237-240, 257-262

sources.go (3)

51-53: LGTM! Clean list initialization.

The list initialization uses consistent terminology and appropriate map capacities.


134-149: LGTM! Well-structured catchall definitions.

The catchall lists are clearly defined with appropriate descriptions and consistent terminology.


409-409: LGTM! Consistent action handling.

The RPZ action handling logic is consistently updated with the new terminology.

Also applies to: 419-450

pop-sources.sample.yaml Show resolved Hide resolved
@zluudg zluudg force-pushed the leon/pop-issue-79/modern-terminology branch from 15327a3 to 9ae9a13 Compare January 15, 2025 13:18
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
pop-policy.sample.yaml (1)

7-11: Consider adding descriptions for policy actions.

The policy configuration would benefit from comments explaining the implications of each action (PASSTHRU, NODATA).

🧰 Tools
🪛 yamllint (1.35.1)

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)

README.md (2)

20-20: Fix abbreviation format.

Change "eg." to "e.g." for proper abbreviation format.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)


72-72: Fix grammar in policy description.

  1. Change "that appear" to "that appears" to match singular subject
  2. Consider rephrasing for clarity:
-the same doubtlisted name that appear in N distinct intelligence feeds
+the same doubtlisted name that appears in N distinct intelligence feeds

Also applies to: 74-74

🧰 Tools
🪛 LanguageTool

[uncategorized] ~72-~72: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...ction. - the same doubtlisted name that appear in N distinct intelligence feeds is i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

policy.go (2)

106-108: Fix comment formatting and typos.

The comment has formatting issues and typos:

-// Note: we onlygethere when we know that this name is only doubtlisted
-// so no need tocheckfor allow- or denylisting
+// Note: we only get here when we know that this name is only doubtlisted
+// so no need to check for allow- or denylisting

148-153: Complete the incomplete comment.

The comment appears to be incomplete:

// Decision to block a doubtlisted name:
// 1. More than N tags present
// 2. Name is present in more than M sources
// 3. Name

The third point is unfinished.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15327a3 and 9ae9a13.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • README.md (4 hunks)
  • apihandler.go (5 hunks)
  • bootstrap.go (5 hunks)
  • config.go (2 hunks)
  • configupdater.go (1 hunks)
  • go.mod (1 hunks)
  • lists.go (4 hunks)
  • mqtt.go (1 hunks)
  • policy.go (3 hunks)
  • pop-outputs.sample.yaml (1 hunks)
  • pop-policy.sample.yaml (1 hunks)
  • pop-sources.sample.yaml (1 hunks)
  • reaper.go (1 hunks)
  • refreshengine.go (2 hunks)
  • rpz.go (6 hunks)
  • sources.go (8 hunks)
  • structs.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (10)
  • reaper.go
  • pop-outputs.sample.yaml
  • mqtt.go
  • configupdater.go
  • bootstrap.go
  • rpz.go
  • lists.go
  • structs.go
  • config.go
  • refreshengine.go
🧰 Additional context used
🪛 yamllint (1.35.1)
pop-sources.sample.yaml

[error] 4-4: syntax error: found character '\t' that cannot start any token

(syntax)


[error] 56-56: trailing spaces

(trailing-spaces)


[error] 61-61: trailing spaces

(trailing-spaces)

pop-policy.sample.yaml

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)

🪛 LanguageTool
README.md

[uncategorized] ~20-~20: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... where sources can even be conflicting (eg. a domainname may be flagged by one sour...

(E_G)


[style] ~53-~53: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...remely compact data structures. TEM is able to mmap very large lists in DAWG format wh...

(BE_ABLE_TO)


[style] ~53-~53: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...ata structures. TEM is able to mmap very large lists in DAWG format which is used for ...

(EN_WEAK_ADJECTIVE)


[uncategorized] ~53-~53: A comma might be missing here.
Context: ...s able to mmap very large lists in DAWG format which is used for large allowlists. -...

(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)


[uncategorized] ~72-~72: This verb does not appear to agree with the subject. Consider using a different form.
Context: ...ction. - the same doubtlisted name that appear in N distinct intelligence feeds is i...

(AI_EN_LECTOR_REPLACEMENT_VERB_AGREEMENT)

🔇 Additional comments (11)
pop-policy.sample.yaml (1)

8-8: Fix YAML syntax: Replace tabs with spaces.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 8-8: syntax error: found character '\t' that cannot start any token

(syntax)

pop-sources.sample.yaml (2)

4-4: Fix YAML syntax issues.

🧰 Tools
🪛 yamllint (1.35.1)

[error] 4-4: syntax error: found character '\t' that cannot start any token

(syntax)


35-37: Verify file paths and permissions.

The file paths /var/tmp/dnstapir/sinful.dawg and /var/tmp/dnstapir/sinful.txt should be reviewed for:

  1. Directory permissions
  2. File ownership
  3. Proper path configuration across different environments

Also applies to: 44-44

✅ Verification successful

Sample configuration paths are correctly structured

The paths in pop-sources.sample.yaml are example paths following standard Unix-like filesystem conventions. As this is a sample configuration file, users are expected to customize these paths according to their environment in their actual configuration files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the paths are consistently used across the codebase
rg -l '/var/tmp/dnstapir'

Length of output: 49


Script:

#!/bin/bash
# Check for README files and documentation
fd README -t f

# Look for other configuration examples or templates
fd -e yaml -e yml

# Search for documentation about paths or configuration
rg -i "configuration|config|path" README.md || true

# Look for environment-specific patterns
rg -i "(/var/|/tmp/|/opt/|path:|directory:)" -A 2 -B 2

Length of output: 2804

policy.go (1)

145-145: Review default return value.

The function returns AllowlistAction as a default. Consider if this is the desired behavior when no conditions are met, as it might be safer to use a more restrictive default.

✅ Verification successful

The default return value is appropriate and intentional

The AllowlistAction default is correct as it's returned only after all blocking conditions (denylist and doubtlist) have been evaluated and not met. This implements a fail-open security model where traffic is allowed unless explicitly blocked by policy rules.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for similar policy computation patterns in the codebase
ast-grep --pattern 'return $_.AllowlistAction'

Length of output: 145


Script:

#!/bin/bash
# Get context around both AllowlistAction returns
echo "=== Context for AllowlistAction returns ==="
rg -B 10 -A 5 "return.*AllowlistAction" policy.go

# Look for other policy action returns
echo -e "\n=== Other policy action returns ==="
rg "return.*Action" policy.go

# Check for documentation about policy actions
echo -e "\n=== Policy action documentation ==="
rg -B 5 "Action.*=|type.*Action" policy.go

Length of output: 2044

sources.go (4)

51-53: LGTM! Terminology changes are consistent.

The changes correctly replace the old terminology with the new one while maintaining the original functionality. The initialization of lists and parsing of policy actions have been updated consistently.

Also applies to: 71-82, 84-85, 90-95, 100-106


134-138: LGTM! Catchall lists are properly renamed.

The initialization of catchall lists has been updated to use the new terminology while preserving the original functionality.

Also applies to: 145-149


318-319: LGTM! Error message updated with new terminology.

The error message for DAWG format validation has been updated to use "allowlist" terminology.


384-388: LGTM! RPZ rule handling updated with new terminology.

The function has been updated to use the new terminology in comments, action constants, list type checks, and warning messages while maintaining the original functionality.

Also applies to: 409-409, 419-420, 423-426, 433-434, 437-440, 443-444, 447-450

apihandler.go (3)

268-275: LGTM! API endpoints updated with new terminology.

The bootstrap API endpoints and their corresponding logic have been updated to use "doubtlist" terminology while maintaining the original functionality.

Also applies to: 277-288, 293-296, 298-298, 306-306, 308-308, 310-310, 326-326, 329-329, 332-332, 340-340, 346-348, 350-350


268-275: Add request validation for doubtlist-status.

The doubtlist-status command handler should validate the ListName parameter before processing.

 case "doubtlist-status":
+    if bp.ListName == "" {
+        resp.Error = true
+        resp.ErrorMsg = "ListName parameter is required"
+        return
+    }
     me := conf.PopData.MqttEngine
     stats := me.Stats()

436-436: LGTM! Debug endpoints updated with new terminology.

The debug API endpoints and their corresponding logic have been updated to use the new terminology while maintaining the original functionality.

Also applies to: 445-446, 472-473

pop-sources.sample.yaml Show resolved Hide resolved
go.mod Show resolved Hide resolved
@johanix johanix merged commit d4d62b9 into dnstapir:main Jan 15, 2025
2 checks passed
@morkrost
Copy link

Regarding the ListType field in the MQTT message, it is really confusing that it coincides with the list types expected to be generated locally. CORE should not have much of an opinion on what you do with the observations we send, so the terminology really should be different.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Modernize terminology for domain lists
4 participants