AI Code Review for Security: A PR Checklist for Auth, Tenant Isolation, Validation, and Secrets

AI code review security is the practice of reviewing AI-assisted pull requests for security regressions before they merge. The checklist should catch newly introduced vulnerabilities, but it also needs to catch something easier to miss: security controls that used to exist and disappeared in the diff.

That second category is where AI-assisted refactors get dangerous.

A PR can look clean. Tests can pass. The route can still return data. But the new version may have quietly removed:

  • an auth decorator
  • a permission branch
  • a tenant filter
  • input validation
  • rate limiting
  • CSRF protection
  • audit logging
  • secret handling

That is why teams using Cursor, Claude Code, GitHub Copilot, Codex, or other AI coding tools need a PR security checklist that starts with the diff, not just the final file state.

If you want a local-first tool for this workflow, the open-source Skylos CLI is here: github.com/duriantaco/skylos.


TL;DR: the AI code review security checklist

Use this checklist before approving an AI-generated or AI-assisted PR:

  • Auth still exists. What to look for: deleted decorators, middleware, guards, session checks, or role checks. Why it matters: missing auth is often a broken access control issue.
  • Tenant scoping still exists. What to look for: removed tenant_id, org_id, workspace, account, or owner filters. Why it matters: cross-tenant data leaks often look like normal query refactors.
  • Validation still runs. What to look for: validation moved after use, deleted schemas, weaker serializer checks. Why it matters: valid shape is not the same as safe input.
  • Rate limits still apply. What to look for: removed rate-limit decorators or gateway config. Why it matters: login, export, webhook, and AI endpoints become abuse surfaces.
  • CSRF and CORS did not weaken. What to look for: removed CSRF middleware, wider CORS origins, disabled same-site policy. Why it matters: browser-facing routes can become exploitable without the code looking obviously dangerous.
  • Audit logging still records sensitive actions. What to look for: deleted security logs around admin, billing, export, auth, or permission changes. Why it matters: incident response gets worse when logs disappear.
  • Secrets were not added. What to look for: API keys, tokens, passwords, private URLs, debug config. Why it matters: AI tools often fill examples with realistic placeholders.
  • Imports and calls are real. What to look for: hallucinated packages, phantom helper functions, nonexistent APIs. Why it matters: plausible names can hide broken or unsafe code paths.
  • Negative tests exist. What to look for: tests for unauthenticated, unauthorized, cross-tenant, invalid input, and rate-limit behavior. Why it matters: happy-path tests can pass while controls are gone.
  • CI blocks high-confidence regressions. What to look for: static analysis, diff-aware scanning, and a PR gate. Why it matters: reviewer memory does not scale with AI-generated volume.

Why AI-assisted PRs fail differently

AI coding tools make it cheap to generate large, plausible changes. That changes the review problem.

Before AI, the reviewer usually asked:

What bug did this developer introduce?

With AI-generated code, the reviewer also has to ask:

What protection did this refactor delete?

The adoption numbers explain why this is becoming a normal engineering problem, not a niche concern. Stack Overflow's 2025 Developer Survey reported that 84% of developers use or plan to use AI tools, while 46% said they do not trust the accuracy of AI-tool output. Veracode's research on AI-generated code found that only 55% of generated code was secure across 80 coding tasks.

Those numbers do not mean every AI-generated PR is unsafe. They mean human review and automated gates need to adapt.

The highest-risk misses are not always obvious injection sinks or eval() calls. They are often deleted controls in ordinary application code.

OWASP's 2025 Top 10 still lists Broken Access Control as A01. That maps directly to the failures this checklist is trying to catch: missing permission checks, IDOR-style data access, and APIs that expose protected operations without the expected server-side enforcement.


A concrete example: tenant isolation removed by a clean refactor

This is the kind of AI-assisted PR that can pass a quick review.

Before:

@router.get("/customers/{customer_id}")
async def get_customer(
    customer_id: str,
    user: User = Depends(require_user),
    db: Session = Depends(get_db),
):
    return (
        db.query(Customer)
        .filter(Customer.id == customer_id)
        .filter(Customer.tenant_id == user.tenant_id)
        .one()
    )

After:

@router.get("/customers/{customer_id}")
async def get_customer(
    customer_id: str,
    db: Session = Depends(get_db),
):
    return (
        db.query(Customer)
        .filter(Customer.id == customer_id)
        .one()
    )

The new code is shorter.

The endpoint still works.

A happy-path test can still pass.

But two controls are gone:

  • require_user
  • Customer.tenant_id == user.tenant_id

That is no longer the same security model. It is now possible for the route to return a customer record without proving the caller belongs to the same tenant.

This is why AI code review security needs to prioritize before-and-after evidence. The danger is not only what the new code contains. It is what the new code no longer contains.


The checklist in detail

1. Did auth or permission checks disappear?

Start here. Do not review helper extractions first. Do not review formatting first.

Search the diff for removed:

  • @login_required
  • @permission_required
  • Depends(require_user)
  • current_user
  • is_admin
  • has_permission
  • can_read
  • can_write
  • require_role
  • require_scope

Then ask:

  • Does every protected route still prove who the caller is?
  • Does every sensitive action still prove what the caller can do?
  • Did the check move to a shared layer, or did it simply disappear?
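A first pass over the diff for these removed markers can be scripted. This is a minimal sketch, not a parser: it flags checklist markers that appear on removed lines of a unified diff, such as the output of git diff origin/main. The marker list is the one above; extend it for your codebase.

```python
# Markers from the checklist above; extend for your codebase.
AUTH_MARKERS = [
    "@login_required", "@permission_required", "Depends(require_user)",
    "current_user", "is_admin", "has_permission",
    "can_read", "can_write", "require_role", "require_scope",
]

def removed_auth_markers(diff_text: str) -> list[str]:
    """Return auth markers found on removed ('-') lines of a unified diff."""
    hits = []
    for line in diff_text.splitlines():
        # Removed lines start with '-'; '---' is the old-file header, not a removal.
        if line.startswith("-") and not line.startswith("---"):
            hits.extend(m for m in AUTH_MARKERS if m in line)
    return hits
```

A hit is not proof of a regression, because the check may have moved to a shared layer. It is a signal that the reviewer should find where the check went.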

GitHub's Copilot code review documentation is explicit that Copilot is not guaranteed to catch every PR issue and should be supplemented with human review. That is the right mental model for all AI review systems: useful assistant, not final security authority.

2. Did tenant scoping disappear?

Tenant scoping is one of the easiest controls to lose in a refactor because it often looks like "query cleanup".

Look for deleted filters involving:

  • tenant_id
  • org_id
  • organization_id
  • workspace_id
  • account_id
  • team_id
  • owner_id
  • project_id

High-risk patterns:

# Before
query = query.filter(Document.tenant_id == user.tenant_id)

# After
query = query.filter(Document.id == document_id)

// Before
await db.invoice.findFirst({
  where: { id, organizationId: session.organizationId },
});

// After
await db.invoice.findFirst({
  where: { id },
});

If a PR touches a query for customer data, billing records, exports, files, admin actions, or AI context retrieval, tenant scoping should be reviewed as a first-class security control.
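One way to make tenant scoping harder to delete is to centralize it in a single data-access helper, so routes never assemble the filter themselves. A minimal in-memory sketch of the idea follows; the names get_owned and CrossTenantAccess are illustrative, not from any framework.

```python
class CrossTenantAccess(Exception):
    """A record exists but belongs to another tenant."""

def get_owned(records: dict, record_id: str, tenant_id: str):
    """Fetch by id, then prove ownership before returning anything.

    Because every route goes through this helper, a refactor cannot
    drop the tenant check without a visible diff in one place.
    """
    row = records.get(record_id)
    if row is None:
        return None
    if row["tenant_id"] != tenant_id:
        raise CrossTenantAccess(record_id)
    return row
```

In a real codebase the same shape applies to a SQLAlchemy or Prisma query wrapper: the tenant filter lives in one reviewed function instead of being repeated in every endpoint.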

3. Did validation move after the dangerous operation?

AI refactors often move logic into helpers. That is fine until validation ends up running after the value is already used.

Look for:

  • request body schemas removed
  • serializer validation weakened
  • allow_any style configs added
  • regex or allowlist validation deleted
  • URL validation replaced with plain string checks
  • file extension checks replacing content-type checks

Bad pattern:

def import_webhook(payload: dict):
    process_webhook(payload)
    validate_webhook(payload)

The validation exists, but it is too late.
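The fix is mechanical: validate at the boundary, then hand only validated data to the processing step. A sketch with hypothetical validate and process functions:

```python
class InvalidWebhook(ValueError):
    pass

def validate_webhook(payload: dict) -> dict:
    # Hypothetical minimal check: require a string event type.
    if not isinstance(payload.get("event"), str):
        raise InvalidWebhook("missing event type")
    return payload

def process_webhook(payload: dict) -> str:
    return f"handled {payload['event']}"

def import_webhook(payload: dict) -> str:
    # Validation runs first; process_webhook never sees raw input.
    return process_webhook(validate_webhook(payload))
```

In review terms: the diff to look for is not whether validate_webhook exists, but whether it still runs before the first use of the payload.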

4. Did rate limiting disappear?

Rate limits are easy to delete because they often look like non-business logic.

Review this especially on:

  • login
  • password reset
  • signup
  • token refresh
  • export endpoints
  • webhook endpoints
  • expensive search
  • AI chat or agent endpoints

Look for removed:

  • @rate_limit
  • limiter.limit
  • throttle
  • gateway annotations
  • queue caps
  • token or spend caps

For LLM-powered products, rate limits are both a security control and a cost-control boundary.
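An in-process limiter is only a stopgap next to gateway or Redis-backed limits, but a sketch makes the control concrete enough to test. A minimal fixed-window limiter, with all names illustrative:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Minimal fixed-window limiter. Real deployments should enforce this
    at the gateway or in a shared store, not in process memory."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(list)  # key -> recent request timestamps

    def allow(self, key: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Keep only timestamps inside the current window.
        recent = [t for t in self.hits[key] if now - t < self.window]
        self.hits[key] = recent
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

The review question stays the same regardless of implementation: does the login, export, or AI endpoint still pass through a limiter after the refactor?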

5. Did CSRF, CORS, or browser security weaken?

This matters when the PR touches routes, middleware, cookies, sessions, forms, or frontend-backed APIs.

Look for:

  • CSRF middleware removed
  • csrf_exempt added
  • CORS widened from one origin to *
  • cookies changed away from HttpOnly, Secure, or SameSite
  • security headers removed
  • auth moved from server-side cookies to local storage without a threat model

Not every API needs CSRF protection in the same way. But every weakening should be intentional and documented.
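For cookie-based sessions, the expected flags can be pinned down in one reviewed place, so any weakening is a one-line diff. A minimal sketch using Python's standard http.cookies module; the session_cookie helper is illustrative:

```python
from http.cookies import SimpleCookie

def session_cookie(value: str) -> str:
    """Build a session cookie with the browser-security flags the
    checklist expects; a PR that drops any of them should be questioned."""
    cookie = SimpleCookie()
    cookie["session"] = value
    cookie["session"]["httponly"] = True   # not readable from JS
    cookie["session"]["secure"] = True     # HTTPS only
    cookie["session"]["samesite"] = "Lax"  # limits cross-site sends
    return cookie["session"].OutputString()
```

The same pattern applies to CORS config: one shared constant for allowed origins beats per-route strings that can quietly widen to *.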

6. Did audit logging disappear?

Deleted logging can be a security regression even when runtime behavior is unchanged.

Sensitive actions should usually leave an audit trail:

  • admin role changes
  • permission changes
  • billing changes
  • customer data exports
  • API key creation
  • password reset
  • user impersonation
  • data deletion
  • security policy changes

If the PR removes an audit event, the reviewer should ask why.
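One way to keep audit trails reviewable is a decorator on each sensitive action, so removing the trail shows up as a one-line deletion in the diff. A minimal sketch; the decorator name and the in-memory sink are illustrative:

```python
import functools

AUDIT_LOG: list = []  # stand-in for a real append-only audit sink

def audited(action: str):
    """Record who performed which sensitive action before running it."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(actor: str, *args, **kwargs):
            AUDIT_LOG.append({"actor": actor, "action": action})
            return fn(actor, *args, **kwargs)
        return inner
    return wrap

@audited("role_change")
def change_role(actor: str, target: str, role: str) -> str:
    return f"{target} is now {role}"
```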

7. Did the PR add secrets or fake examples?

AI tools often produce realistic examples:

OPENAI_API_KEY = "sk-proj-example"
DATABASE_URL = "postgres://admin:password@prod-db.internal/app"
JWT_SECRET = "change-me"

Even placeholder-looking secrets are a problem because placeholders get copied, reused, and forgotten.
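A rough pre-review pass for secret-shaped strings can be scripted. The patterns below are illustrative and deliberately loose; a sketch:

```python
import re

# Rough patterns for common secret shapes; tune for your stack.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{8,}"),            # OpenAI-style keys
    re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),   # DSNs with embedded credentials
    re.compile(r"(?i)(secret|token|password)\s*=\s*[\"'][^\"']+[\"']"),
]

def find_secret_like(text: str) -> list:
    """Return substrings that look like secrets, placeholder or not."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits
```

Regexes like these produce false positives by design; the point is to force a human decision on anything secret-shaped, placeholder or real.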

For local scanning, Skylos can run danger and quality checks before the PR gets reviewed:

pipx install skylos
skylos . --danger --quality

The OSS repo is here: github.com/duriantaco/skylos.

8. Did the AI hallucinate imports, packages, or helpers?

This is common in Python and JavaScript/TypeScript.

Look for:

  • packages not in requirements.txt, pyproject.toml, package.json, or lockfiles
  • helper functions with plausible names but no definition
  • library methods that do not exist in your installed version
  • code comments saying "assume this exists"

This is both a reliability problem and a supply-chain problem. A hallucinated import can become a malicious dependency risk if someone installs a similarly named package just to make the code run.
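For Python, a cheap hallucination tripwire can be built from the standard library alone: parse the changed file and check whether each imported top-level module resolves in the current environment. A minimal sketch:

```python
import ast
import importlib.util

def unresolvable_imports(source: str) -> list:
    """Top-level modules imported by `source` that cannot be found
    in the current environment."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue  # skip relative imports and non-import nodes
        for name in names:
            root = name.split(".")[0]
            if importlib.util.find_spec(root) is None:
                missing.append(root)
    return missing
```

This only proves the module resolves somewhere, not that it is the package you intended, so it complements rather than replaces a lockfile check.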

For a deeper workflow, see How to Verify AI-Generated Python Code and Catch Hallucinated Imports.

9. Did tests cover the security-negative path?

Happy-path tests are not enough for AI-assisted PRs.

For a protected route, require tests for:

  • unauthenticated request
  • authenticated but unauthorized request
  • cross-tenant request
  • invalid input
  • missing required scope
  • rate-limit behavior where relevant

The test that matters for the earlier tenant example is not:

def test_get_customer_returns_200():
    ...

It is:

def test_get_customer_rejects_cross_tenant_access():
    ...

10. Is there a CI gate, or is this all reviewer memory?

Manual review is a weak control if the same issue appears every week.

Use the manual checklist to learn the patterns. Then move the high-confidence checks into automation.

For Skylos:

skylos . --danger --quality
skylos defend .
skylos debt .

For pull requests, use diff-aware scanning where the reviewer needs to know what changed:

skylos . --danger --diff origin/main

For a full local bundle:

skylos suite .

To upload suite results to Skylos Cloud as separate scan families:

skylos suite . --upload

If you want the GitHub Actions workflow, see Python Security Scanner for GitHub Actions.


Where static analysis fits

Static analysis is still necessary. The question is what job each tool is doing.

CodeQL is strong for semantic analysis and GitHub code scanning. Semgrep taint analysis is useful for tracking untrusted data from sources to sinks. Snyk, SonarQube, and other AppSec tools cover broader platform workflows.

But AI code review security adds another question:

Did this PR remove a security control that used to exist?

That is why diff-aware scanning matters. A whole-repo scan can tell you what exists now. A diff-aware review can tell you what disappeared.

Skylos is designed to sit beside those tools as a local-first PR guardrail, especially for:

  • removed auth checks
  • tenant-scoping regressions
  • dead code and cleanup risk
  • AI-generated code verification
  • LLM app defense checks
  • technical debt hotspots
  • CI and optional cloud upload workflows

For a broader tool comparison, read Best AI Code Security Tools in 2026 Compared.


The minimum review policy for AI-assisted PRs

If you want a simple policy, use this:

  1. Any PR touching auth, billing, admin, exports, tenants, files, webhooks, or AI endpoints gets security review.
  2. The reviewer checks deleted controls before reviewing new helper code.
  3. Cross-tenant negative tests are required for tenant-scoped data.
  4. High-confidence static findings block merge.
  5. Diff-aware regression findings block merge.
  6. Reviewer comments that repeat twice become CI rules.

That last point matters. The goal is not to turn senior engineers into permanent AI babysitters. The goal is to convert repeated review pain into automated gates.


A Dev.to-friendly summary

If you cross-post this to Dev.to, use the canonical URL back to the Skylos site and lead with the concrete failure mode:

AI code review is not just "find the new bug." The scarier failure mode is a clean refactor that removes an auth check, tenant filter, validation branch, rate limit, or audit log while tests still pass.

Suggested Dev.to tags:

  • ai
  • security
  • opensource
  • codequality

DEV supports canonical URLs for cross-posting, so set the canonical URL to:

https://skylos.dev/blog/ai-code-review-security-pr-checklist

Final takeaway

AI code review security is not about distrusting every AI-generated line. It is about recognizing that AI-assisted PRs create a new review shape.

The risky change may be the thing that disappeared.

Start with the manual checklist:

  • auth still exists
  • tenant scoping still exists
  • validation still runs before use
  • rate limits still apply
  • CSRF, CORS, cookies, and headers did not weaken
  • audit logging still records sensitive actions
  • secrets were not added
  • imports and helpers are real
  • negative tests cover the security path

Then enforce the repeatable parts with local scanning and CI.

Skylos is open source and local-first. Try it here: github.com/duriantaco/skylos.

pipx install skylos
skylos . --danger --quality
skylos . --danger --diff origin/main

If your team wants one command for static analysis, AI defense, technical debt, and optional cloud upload:

skylos suite .
skylos suite . --upload

Sources