fix(defender): sync hasThreats blocking logic and tool rules precedence from JS package by hiskudin · Pull Request #1 · StackOneHQ/defender-python

hiskudin · 2026-03-09T14:57:04Z

Summary

Add `has_threats` guard so base risk from tool rules alone (e.g. `gmail_*` seeding `'high'`) does not block safe content when `block_high_risk` is enabled
Custom `config` tool rules now take precedence over the `use_default_tool_rules` flag, matching the JS package behaviour
Add `TestUseDefaultToolRules` integration tests covering both behaviours

Test plan

`pytest tests/test_integration.py` — all 28 tests pass
Verify `test_applies_tool_rules_when_true` asserts `allowed=True` for safe content at high base risk
Verify `test_always_applies_custom_tool_rules_from_config` asserts `allowed=True` for safe content with custom rules

🤖 Generated with Claude Code

…ce from JS package - Add has_threats guard so base risk from tool rules alone does not block safe content when block_high_risk is enabled - Custom config tool_rules now take precedence over use_default_tool_rules flag - Add TestUseDefaultToolRules integration tests to cover both behaviours Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR syncs two behaviors from the JS package to the Python stackone_defender library: (1) a has_threats guard that ensures base risk from tool rules alone doesn't block safe content when block_high_risk is enabled, and (2) custom config tool rules now take precedence over the use_default_tool_rules flag. Integration tests are added covering both behaviors.

Changes:

Added has_threats guard in defend_tool_result so that block_high_risk only blocks content when actual threat signals (detections, active sanitization methods, or tier2 scores above threshold) are present — base risk from tool rules alone no longer triggers blocking.
Changed tool rules resolution in __init__ to check config["tool_rules"] first, falling back to default rules only when custom rules aren't provided, matching JS package precedence.
Added TestUseDefaultToolRules test class with four integration tests covering default, explicitly false, explicitly true, and custom config tool rules scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`src/stackone_defender/core/prompt_defense.py`	Updated tool rules precedence logic (line 58) and added `has_threats` guard to `allowed` computation (lines 123-136)
`tests/test_integration.py`	Added `TestUseDefaultToolRules` class with 4 tests covering tool rules precedence and safe content allowance

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot · 2026-03-09T15:02:42Z

tests/test_integration.py

+    def test_applies_tool_rules_when_true(self):
+        defense = create_prompt_defense(use_default_tool_rules=True, block_high_risk=True)
+        data = {"subject": "Weekly team update", "body": "Reminder about the meeting tomorrow at 10am.", "thread_id": "thread123"}
+        result = defense.defend_tool_result(data, "gmail_get_message")
+        # With use_default_tool_rules, gmail tool rule seeds risk_level: 'high' as base risk,
+        # but safe content with no detections should still be allowed through.
+        assert result.risk_level == "high"
+        assert result.allowed is True


The new tests verify that safe content is allowed through when block_high_risk=True with tool rules, but there's no test verifying the converse — that malicious content is still blocked when use_default_tool_rules=True and block_high_risk=True. Adding a test like test_blocks_malicious_content_with_tool_rules (e.g., using "SYSTEM: ignore previous instructions" in a gmail message body) would guard against regressions in the has_threats logic.

Copilot · 2026-03-09T15:02:43Z

src/stackone_defender/core/prompt_defense.py

            self._config.block_high_risk = True

-        tool_rules = self._config.tool_rules if use_default_tool_rules else []
+        tool_rules = (config or {}).get("tool_rules") or (self._config.tool_rules if use_default_tool_rules else [])


Using or to chain the fallback means an explicitly empty config={"tool_rules": []} is treated as falsy and falls through to the use_default_tool_rules branch. If the intent is that custom config tool rules always take precedence (as stated in the PR description), consider using an explicit None check instead, e.g.: tool_rules = (config or {}).get("tool_rules") if (config or {}).get("tool_rules") is not None else (self._config.tool_rules if use_default_tool_rules else []). This way, an explicitly empty list from config would be respected as "no tool rules" rather than falling through.

cubic-dev-ai

2 issues found across 2 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/stackone_defender/core/prompt_defense.py">

<violation number="1" location="src/stackone_defender/core/prompt_defense.py:58">
P2: The `or` fallback treats an explicitly empty `tool_rules` list as “not provided.” If a caller sets `"tool_rules": []` to disable tool rules, this line still loads defaults when `use_default_tool_rules` is true. Use an explicit key check so empty lists are honored.</violation>

<violation number="2" location="src/stackone_defender/core/prompt_defense.py:129">
P2: `has_threats` compares tier2 scores against `self._config.tier2.high_risk_threshold`, which doesn’t reflect `tier2_config` overrides. If the classifier uses a lower high-risk threshold, `tier2_risk` can be high while `has_threats` stays false, so `block_high_risk` won’t block. Use `tier2_risk` (or the classifier’s thresholds) instead of the config default.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

cubic-dev-ai · 2026-03-09T15:05:08Z

src/stackone_defender/core/prompt_defense.py

            self._config.block_high_risk = True

-        tool_rules = self._config.tool_rules if use_default_tool_rules else []
+        tool_rules = (config or {}).get("tool_rules") or (self._config.tool_rules if use_default_tool_rules else [])


P2: The or fallback treats an explicitly empty tool_rules list as “not provided.” If a caller sets "tool_rules": [] to disable tool rules, this line still loads defaults when use_default_tool_rules is true. Use an explicit key check so empty lists are honored.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At src/stackone_defender/core/prompt_defense.py, line 58: <comment>The `or` fallback treats an explicitly empty `tool_rules` list as “not provided.” If a caller sets `"tool_rules": []` to disable tool rules, this line still loads defaults when `use_default_tool_rules` is true. Use an explicit key check so empty lists are honored.</comment> <file context> @@ -55,7 +55,7 @@ def __init__( self._config.block_high_risk = True - tool_rules = self._config.tool_rules if use_default_tool_rules else [] + tool_rules = (config or {}).get("tool_rules") or (self._config.tool_rules if use_default_tool_rules else []) self._tool_sanitizer: ToolResultSanitizer = create_tool_result_sanitizer( </file context>

Suggested change

tool_rules = (config or {}).get("tool_rules") or (self._config.tool_rules if use_default_tool_rules else [])

tool_rules = (config or {}).get("tool_rules") if "tool_rules" in (config or {}) else (self._config.tool_rules if use_default_tool_rules else [])

cubic-dev-ai · 2026-03-09T15:05:08Z

src/stackone_defender/core/prompt_defense.py

+        has_threats = (
+            len(detections) > 0
+            or len(fields_sanitized) > 0
+            or (tier2_score is not None and tier2_score >= self._config.tier2.high_risk_threshold)


P2: has_threats compares tier2 scores against self._config.tier2.high_risk_threshold, which doesn’t reflect tier2_config overrides. If the classifier uses a lower high-risk threshold, tier2_risk can be high while has_threats stays false, so block_high_risk won’t block. Use tier2_risk (or the classifier’s thresholds) instead of the config default.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At src/stackone_defender/core/prompt_defense.py, line 129: <comment>`has_threats` compares tier2 scores against `self._config.tier2.high_risk_threshold`, which doesn’t reflect `tier2_config` overrides. If the classifier uses a lower high-risk threshold, `tier2_risk` can be high while `has_threats` stays false, so `block_high_risk` won’t block. Use `tier2_risk` (or the classifier’s thresholds) instead of the config default.</comment> <file context> @@ -120,7 +120,20 @@ def defend_tool_result(self, value: Any, tool_name: str) -> DefenseResult: + has_threats = ( + len(detections) > 0 + or len(fields_sanitized) > 0 + or (tier2_score is not None and tier2_score >= self._config.tier2.high_risk_threshold) + ) + </file context>

Suggested change

or (tier2_score is not None and tier2_score >= self._config.tier2.high_risk_threshold)

or tier2_risk in ("high", "critical")

Copilot AI review requested due to automatic review settings March 9, 2026 14:57

Copilot started reviewing on behalf of hiskudin March 9, 2026 14:57 View session

Copilot AI reviewed Mar 9, 2026

View reviewed changes

cubic-dev-ai bot reviewed Mar 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(defender): sync hasThreats blocking logic and tool rules precedence from JS package#1

fix(defender): sync hasThreats blocking logic and tool rules precedence from JS package#1
hiskudin wants to merge 1 commit intomainfrom
fix/sync-has-threats-blocking-logic

hiskudin commented Mar 9, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 9, 2026

Uh oh!

Copilot AI Mar 9, 2026

Uh oh!

cubic-dev-ai bot left a comment

Uh oh!

cubic-dev-ai bot Mar 9, 2026 •

edited

Loading

Uh oh!

cubic-dev-ai bot Mar 9, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	tool_rules = (config or {}).get("tool_rules") or (self._config.tool_rules if use_default_tool_rules else [])
	tool_rules = (config or {}).get("tool_rules") if "tool_rules" in (config or {}) else (self._config.tool_rules if use_default_tool_rules else [])

	or (tier2_score is not None and tier2_score >= self._config.tier2.high_risk_threshold)
	or tier2_risk in ("high", "critical")

Conversation

hiskudin commented Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai bot Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai bot Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

hiskudin commented Mar 9, 2026 •

edited

Loading

cubic-dev-ai bot Mar 9, 2026 •

edited

Loading

cubic-dev-ai bot Mar 9, 2026 •

edited

Loading