WEEK 6

Security & Vulnerabilities
安全与漏洞修复

Scan, Identify, and Fix Security Vulnerabilities
扫描、识别和修复安全漏洞

Learn to use Semgrep for static analysis, identify OWASP Top 10 vulnerabilities, and implement security fixes including SQL injection prevention, XSS protection, and secure cryptography.

学习使用Semgrep进行静态分析,识别OWASP十大漏洞,并实施安全修复,包括SQL注入防护、XSS保护和安全加密。

Hands-On Security Tutorial / 实践安全教程

🎯 Week 6 Project / 本周项目
Perform a complete security audit on the Developer's Command Center application, identify 8 vulnerabilities using Semgrep, and implement fixes for SQL injection, XSS, weak cryptography, and more.

对开发者控制中心应用执行完整的安全审计,使用Semgrep识别8个漏洞,并实施SQL注入、XSS、弱加密等修复。

Vulnerability #1: SQL Injection / 漏洞#1:SQL注入

SQL Injection (CWE-89) ✓ FIXED
Location: backend/app/routers/notes.py lines 69-92
Risk: Attackers can manipulate database queries to exfiltrate, modify, or delete data
风险:攻击者可以操纵数据库查询来泄露、修改或删除数据

❌ Vulnerable Code / 易受攻击代码

@router.get("/unsafe-search", response_model=list[NoteRead])
def unsafe_search(q: str, db: Session = Depends(get_db)):
    sql = text(
        f"""
        SELECT id, title, content, created_at, updated_at
        FROM notes
        WHERE title LIKE '%{q}%' OR content LIKE '%{q}%'
        ORDER BY created_at DESC
        LIMIT 50
        """
    )
    rows = db.execute(sql).all()
    return [NoteRead.model_validate(row) for row in rows]

✅ Fixed Code / 修复代码

@router.get("/safe-search", response_model=list[NoteRead])
def safe_search(q: str, db: Session = Depends(get_db)):
    """Search notes using parameterized query to prevent SQL injection."""
    # Bind the search term as a named parameter (:search_pattern) instead of
    # interpolating it into the SQL string, so user input cannot alter the query.
    sql = text(
        """
        SELECT id, title, content, created_at, updated_at
        FROM notes
        WHERE title LIKE :search_pattern OR content LIKE :search_pattern
        ORDER BY created_at DESC
        LIMIT 50
        """
    )
    search_pattern = f"%{q}%"
    rows = db.execute(sql, {"search_pattern": search_pattern}).all()
    return [NoteRead.model_validate(row) for row in rows]

💡 Why This Works / 为什么有效

Parameterized queries separate SQL code from data. The query text is fixed up front, and the database driver passes the bound parameter as a literal value that is never parsed as SQL. This prevents injection through the bound value because:

参数化查询将SQL代码与数据分离。查询文本预先固定,数据库驱动将绑定参数作为字面值传递,不会将其解析为SQL。这可以防止通过绑定值进行注入,因为:

1. User input is never part of SQL syntax - The query structure is fixed before any input is bound
2. No manual escaping is needed - The driver sends the value separately from the query text (or escapes it on your behalf), so quotes and comment markers have no special meaning
3. Context is preserved - Parameters are always treated as values, not SQL keywords

1. 用户输入从不成为SQL语法的一部分 - 查询结构在绑定任何输入之前就已固定
2. 无需手动转义 - 驱动将值与查询文本分开发送(或代为转义),因此引号和注释符号不具有特殊含义
3. 保留上下文 - 参数始终被视为值,而非SQL关键字
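The difference can be reproduced outside the application. The sketch below is a minimal illustration that uses Python's built-in sqlite3 module rather than the project's SQLAlchemy session; the table contents, function names, and payload are invented for the demonstration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two illustrative notes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, content TEXT)")
    conn.executemany(
        "INSERT INTO notes (title, content) VALUES (?, ?)",
        [("groceries", "milk and eggs"), ("secret", "api key 1234")],
    )
    return conn


def unsafe_search(conn: sqlite3.Connection, q: str) -> list[str]:
    # Vulnerable: the search term becomes part of the SQL text.
    sql = f"SELECT title FROM notes WHERE title LIKE '%{q}%'"
    return [row[0] for row in conn.execute(sql)]


def safe_search(conn: sqlite3.Connection, q: str) -> list[str]:
    # Safe: the search term is bound as a value via a named parameter.
    sql = "SELECT title FROM notes WHERE title LIKE :pattern"
    return [row[0] for row in conn.execute(sql, {"pattern": f"%{q}%"})]


conn = make_db()
payload = "zzz' OR 1=1 --"  # closes the string literal, then comments out the rest

print(sorted(unsafe_search(conn, payload)))  # every title leaks
print(safe_search(conn, payload))            # payload is just an unmatched search term
```

With the interpolated query the payload rewrites the WHERE clause and returns every row. With the bound parameter the same payload is only a search string that matches nothing.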

Vulnerability #2: Weak Cryptography / 漏洞#2:弱加密

Weak Cryptographic Hash (CWE-327) ✓ FIXED
Location: backend/app/routers/notes.py lines 95-99
Risk: MD5 is cryptographically broken: collisions can be generated in practice, and it is fast enough to make brute-forcing hashed inputs cheap
风险:MD5在密码学上已被攻破:可以实际构造碰撞,而且计算速度极快,暴力破解哈希输入的成本很低

❌ Vulnerable Code / 易受攻击代码

@router.get("/debug/hash-md5")
def debug_hash_md5(q: str) -> dict[str, str]:
    import hashlib
    return {"algo": "md5", "hex": hashlib.md5(q.encode()).hexdigest()}

✅ Fixed Code / 修复代码

@router.get("/debug/hash-sha256")
def debug_hash_sha256(q: str, add_salt: bool = False) -> dict[str, str | bool]:
    """Hash input using SHA-256 with optional salt.

    The "salted" field is a bool, hence the str | bool value type (Python 3.10+).
    """
    import hashlib
    import secrets

    if add_salt:
        # Generate a cryptographically secure random salt
        salt = secrets.token_hex(16)
        value = q.encode() + salt.encode()
        hex_digest = hashlib.sha256(value).hexdigest()
        return {
            "algo": "sha256",
            "hex": hex_digest,
            "salted": True,
            "salt": salt,
        }
    else:
        # SHA-256 is much stronger than MD5
        hex_digest = hashlib.sha256(q.encode()).hexdigest()
        return {"algo": "sha256", "hex": hex_digest, "salted": False}

💡 Why This Works / 为什么有效

SHA-256 is part of the NIST-approved SHA-2 family. No practical collision attack against it is known, and its 256-bit output gives roughly 128 bits of collision resistance. The optional random salt defeats precomputed rainbow tables, since identical inputs no longer produce identical digests.

Note: For password hashing, use bcrypt, scrypt, or argon2 instead, as they're designed to be slow (computationally expensive) to thwart brute-force attacks.

SHA-256属于NIST批准的SHA-2系列。目前没有已知的实际碰撞攻击,其256位输出提供约128位的抗碰撞强度。可选的随机盐使相同输入不再产生相同摘要,从而使预计算的彩虹表失效。

注意:对于密码哈希,请改用bcrypt、scrypt或argon2,因为它们被设计得很慢(计算量大),以阻止暴力破解攻击。
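To make the password-hashing note concrete, here is a minimal sketch using hashlib.scrypt from the Python standard library (available when Python is built against a recent OpenSSL). The function names and cost parameters are assumptions chosen for illustration, not part of the project; production code would tune the parameters and typically rely on a maintained password-hashing library.

```python
import hashlib
import hmac
import secrets

# Assumed cost parameters for illustration only; tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a password using a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Unlike a single SHA-256 call, each guess here costs the attacker a deliberately expensive computation, and the per-password salt must be stored alongside the digest so verification can repeat it.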

Vulnerability #3: Cross-Site Scripting (XSS) / 漏洞#3:跨站脚本

Cross-Site Scripting (CWE-79) ✓ FIXED
Location: frontend/app.js line 14
Risk: Malicious scripts execute in users' browsers, stealing cookies or performing actions
风险:恶意脚本在用户浏览器中执行,窃取cookie或执行操作

❌ Vulnerable Code / 易受攻击代码

async function loadNotes(params = {}) {
  const list = document.getElementById('notes');
  list.innerHTML = '';
  const notes = await fetchJSON('/notes/?' + new URLSearchParams(params));
  for (const n of notes) {
    const li = document.createElement('li');
    li.innerHTML = `${n.title}: ${n.content}`;  // ⚠️ XSS vulnerability
    list.appendChild(li);
  }
}

✅ Fixed Code / 修复代码

async function loadNotes(params = {}) {
  const list = document.getElementById('notes');
  list.innerHTML = '';
  const notes = await fetchJSON('/notes/?' + new URLSearchParams(params));
  for (const n of notes) {
    const li = document.createElement('li');

    // Use textContent / createTextNode instead of innerHTML to prevent XSS
    const titleEl = document.createElement('strong');
    titleEl.textContent = n.title;  // ✅ Safe - inserted as plain text, never parsed as HTML
    li.appendChild(titleEl);

    const contentText = document.createTextNode(`: ${n.content}`);
    li.appendChild(contentText);

    list.appendChild(li);
  }
}

💡 Why This Works / 为什么有效

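textContent and createTextNode insert the value as a plain text node, so the browser never parses it as HTML. Characters like <, >, and & are displayed literally and cannot open a tag or attribute.

Example: With innerHTML, user input such as <img src=x onerror=alert('XSS')> runs the attacker's handler as soon as the image fails to load. (A <script> tag inserted through innerHTML is not executed by browsers, but event-handler attributes like onerror are.) With textContent, the same input displays literally as text.

textContent和createTextNode将值作为纯文本节点插入,浏览器不会将其解析为HTML。<、>、&等字符会按字面显示,无法开启标签或属性。

示例:使用innerHTML时,用户输入<img src=x onerror=alert('XSS')>会在图片加载失败时立即执行攻击者的处理程序。(通过innerHTML插入的<script>标签不会被浏览器执行,但onerror等事件处理属性会被执行。)使用textContent时,同样的输入会按字面显示为文本。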
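The same principle applies if HTML is ever assembled on the server instead of in the browser: user-supplied text must be encoded before it is placed into markup. The sketch below uses html.escape from the Python standard library; render_note_item is a hypothetical helper and is not part of the project.

```python
import html


def render_note_item(title: str, content: str) -> str:
    """Build an <li> fragment with user-supplied text HTML-escaped."""
    return f"<li><strong>{html.escape(title)}</strong>: {html.escape(content)}</li>"


payload = "<img src=x onerror=alert(1)>"
print(render_note_item("todo", payload))
# <li><strong>todo</strong>: &lt;img src=x onerror=alert(1)&gt;</li>
```

Here the entity conversion is explicit: the angle brackets become &lt; and &gt;, so the browser renders the payload as text rather than creating an element.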

OWASP Top 10 Coverage / OWASP十大覆盖

A01 Broken Access Control - Path traversal vulnerabilities identified and documented
A02 Cryptographic Failures - Weak MD5 hashing replaced with SHA-256 and salting
A03 Injection - SQL injection (fixed with parameterized queries) and command injection fixed; XSS vulnerabilities patched with safe DOM APIs (the OWASP Top 10 2021 groups XSS under Injection, not under A08 Software & Data Integrity Failures)

Security Best Practices / 安全最佳实践

✓ Parameterized Queries / 参数化查询

Always use parameterized queries or ORM methods to prevent SQL injection.

始终使用参数化查询或ORM方法来防止SQL注入。

✓ Input Validation / 输入验证

Never trust user input. Always validate, sanitize, and use safe APIs.

永远不要信任用户输入。始终验证、清理并使用安全API。

✓ Safe DOM APIs / 安全DOM API

Use textContent instead of innerHTML for user-generated content.

对用户生成内容使用textContent而非innerHTML。

✓ Strong Cryptography / 强加密

Use modern algorithms: SHA-256 for general-purpose hashing; bcrypt, scrypt, or argon2 for passwords. Avoid MD5 and SHA-1.

使用现代算法:通用哈希使用SHA-256,密码哈希使用bcrypt、scrypt或argon2。避免MD5和SHA-1。

✓ Defense in Depth / 纵深防御

Apply multiple layers of security controls for comprehensive protection.

应用多层安全控制以实现全面保护。

✓ Regular Scanning / 定期扫描

Use Semgrep and other SAST tools to catch vulnerabilities early.

使用Semgrep和其他SAST工具尽早发现漏洞。
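As one concrete reading of the input-validation practice above, the following sketch validates a search term and escapes LIKE wildcards before the value is bound to a query. Both helper names and the length limit are assumptions made for illustration; parameter binding stops injection, but % and _ inside a bound value still act as wildcards unless escaped.

```python
def validate_search_query(q: str, max_len: int = 100) -> str:
    """Trim the query and reject empty, oversized, or control-character input."""
    q = q.strip()
    if not q:
        raise ValueError("search query must not be empty")
    if len(q) > max_len:
        raise ValueError(f"search query longer than {max_len} characters")
    if any(ord(ch) < 32 for ch in q):
        raise ValueError("search query contains control characters")
    return q


def escape_like(value: str, escape: str = "\\") -> str:
    """Escape the LIKE wildcards % and _ (and the escape character itself)."""
    return (
        value.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


pattern = f"%{escape_like(validate_search_query(' 50%_off '))}%"
print(pattern)
```

The resulting pattern would still be passed as a bound parameter, with the query using a clause such as LIKE :search_pattern ESCAPE '\' so the database honours the escape character.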

Using Semgrep / 使用Semgrep

1. Install Semgrep / 安装Semgrep
pip install semgrep

2. Run Security Scan / 运行安全扫描
# Scan for Python security issues
semgrep --config=auto backend/
# Scan with a specific ruleset
semgrep --config=p/python backend/
# Generate SARIF report
semgrep --config=auto --sarif --output=report.sarif backend/

3. Review Findings / 审查发现
Semgrep reports each finding with its file location, line number, and the rule's message, which often includes remediation advice. Prioritize the highest-severity findings first.

Semgrep会报告每个发现的文件位置、行号和规则消息,其中通常包含修复建议。优先处理严重性最高的问题。
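Reviewing findings by severity can be scripted. The sketch below assumes a report written with semgrep --config=auto --json --output=report.json backend/ and assumes the JSON layout (a top-level results list whose entries carry check_id, path, start.line, and extra.severity); verify these field names against the Semgrep version in use. The sample data and rule IDs are hand-written placeholders, not real scanner output.

```python
import json

# Lower rank sorts first; unknown severities sort last.
SEVERITY_RANK = {"ERROR": 0, "WARNING": 1, "INFO": 2}


def summarize_findings(report: dict) -> list[tuple[str, str, int, str]]:
    """Return (severity, path, line, rule_id) tuples, most severe first."""
    rows = []
    for result in report.get("results", []):
        severity = result.get("extra", {}).get("severity", "INFO")
        rows.append((severity, result["path"], result["start"]["line"], result["check_id"]))
    rows.sort(key=lambda r: (SEVERITY_RANK.get(r[0], len(SEVERITY_RANK)), r[1], r[2]))
    return rows


# Hand-written sample shaped like a Semgrep JSON report (hypothetical rule IDs).
sample = json.loads("""
{
  "results": [
    {"check_id": "example.weak-hash-md5", "path": "backend/app/routers/notes.py",
     "start": {"line": 95}, "extra": {"severity": "WARNING"}},
    {"check_id": "example.sql-injection", "path": "backend/app/routers/notes.py",
     "start": {"line": 69}, "extra": {"severity": "ERROR"}}
  ]
}
""")

for severity, path, line, rule_id in summarize_findings(sample):
    print(f"{severity:8} {path}:{line}  {rule_id}")
```

For a real scan, replace the sample with json.load on the generated report file.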

Achievements / 成就

  • ✅ Performed complete security audit using Semgrep
  • ✅ Identified 8 security vulnerabilities across SAST categories
  • ✅ Fixed SQL injection with parameterized queries
  • ✅ Replaced weak MD5 with SHA-256 + salting
  • ✅ Patched XSS vulnerabilities with safe DOM APIs
  • ✅ Documented OWASP Top 10 coverage
  • ✅ Implemented security best practices
  • ✅ 使用Semgrep执行完整安全审计
  • ✅ 识别了SAST类别的8个安全漏洞
  • ✅ 使用参数化查询修复SQL注入
  • ✅ 用SHA-256+盐替换弱MD5
  • ✅ 使用安全DOM API修补XSS漏洞
  • ✅ 记录了OWASP十大覆盖
  • ✅ 实施了安全最佳实践