AI Code Review for Security: A PR Checklist for Auth, Tenant Isolation, Validation, and Secrets

AI code review security is the practice of reviewing AI-assisted pull requests for security regressions before they merge. The checklist should catch newly introduced vulnerabilities, but it also needs to catch something easier to miss: security controls that used to exist and disappeared in the diff.

That second category is where AI-assisted refactors get dangerous.

A PR can look clean. Tests can pass. The route can still return data. But the new version may have quietly removed:

  • an auth decorator
  • a permission branch
  • a tenant filter
  • input validation
  • rate limiting
  • CSRF protection
  • audit logging
  • secret handling

That is why teams using Cursor, Claude Code, GitHub Copilot, Codex, or other AI coding tools need a PR security checklist that starts with the diff, not just the final file state.

If you want a local-first tool for this workflow, the open-source Skylos CLI is here: github.com/duriantaco/skylos.


TL;DR: the AI code review security checklist

Use this checklist before approving an AI-generated or AI-assisted PR:

  • Auth still exists. What to look for: deleted decorators, middleware, guards, session checks, or role checks. Why it matters: missing auth is often a broken access control issue.
  • Tenant scoping still exists. What to look for: removed tenant_id, org_id, workspace, account, or owner filters. Why it matters: cross-tenant data leaks often look like normal query refactors.
  • Validation still runs. What to look for: validation moved after use, deleted schemas, weaker serializer checks. Why it matters: valid shape is not the same as safe input.
  • Rate limits still apply. What to look for: removed rate-limit decorators or gateway config. Why it matters: login, export, webhook, and AI endpoints become abuse surfaces.
  • CSRF and CORS did not weaken. What to look for: removed CSRF middleware, wider CORS origins, disabled same-site policy. Why it matters: browser-facing routes can become exploitable without the code looking obviously dangerous.
  • Audit logging still records sensitive actions. What to look for: deleted security logs around admin, billing, export, auth, or permission changes. Why it matters: incident response gets worse when logs disappear.
  • Secrets were not added. What to look for: API keys, tokens, passwords, private URLs, debug config. Why it matters: AI tools often fill examples with realistic placeholders.
  • Imports and calls are real. What to look for: hallucinated packages, phantom helper functions, nonexistent APIs. Why it matters: plausible names can hide broken or unsafe code paths.
  • Negative tests exist. What to look for: tests for unauthenticated, unauthorized, cross-tenant, invalid input, and rate-limit behavior. Why it matters: happy-path tests can pass while controls are gone.
  • CI blocks high-confidence regressions. What to look for: static analysis, diff-aware scanning, and a PR gate. Why it matters: reviewer memory does not scale with AI-generated volume.

Why AI-assisted PRs fail differently

AI coding tools make it cheap to generate large, plausible changes. That changes the review problem.

Before AI, the reviewer usually asked:

What bug did this developer introduce?

With AI-generated code, the reviewer also has to ask:

What protection did this refactor delete?

The adoption numbers explain why this is becoming a normal engineering problem, not a niche concern. Stack Overflow's 2025 Developer Survey reported that 84% of developers use or plan to use AI tools, while 46% said they do not trust the accuracy of AI-tool output. Veracode's research on AI-generated code found that only 55% of generated code was secure across 80 coding tasks.

Those numbers do not mean every AI-generated PR is unsafe. They mean human review and automated gates need to adapt.

The highest-risk misses are not always obvious injection sinks or eval() calls. They are often deleted controls in ordinary application code.

OWASP's 2025 Top 10 still lists Broken Access Control as A01. That maps directly to the failures this checklist is trying to catch: missing permission checks, IDOR-style data access, and APIs that expose protected operations without the expected server-side enforcement.


A concrete example: tenant isolation removed by a clean refactor

This is the kind of AI-assisted PR that can pass a quick review.

Before:

@router.get("/customers/{customer_id}")
async def get_customer(
    customer_id: str,
    user: User = Depends(require_user),
    db: Session = Depends(get_db),
):
    return (
        db.query(Customer)
        .filter(Customer.id == customer_id)
        .filter(Customer.tenant_id == user.tenant_id)
        .one()
    )

After:

@router.get("/customers/{customer_id}")
async def get_customer(
    customer_id: str,
    db: Session = Depends(get_db),
):
    return (
        db.query(Customer)
        .filter(Customer.id == customer_id)
        .one()
    )

The new code is shorter.

The endpoint still works.

A happy-path test can still pass.

But two controls are gone:

  • require_user
  • Customer.tenant_id == user.tenant_id

That is no longer the same security model. It is now possible for the route to return a customer record without proving the caller belongs to the same tenant.

This is why AI code review security needs to prioritize before-and-after evidence. The danger is not only what the new code contains. It is what the new code no longer contains.


The checklist in detail

1. Did auth or permission checks disappear?

Start here. Do not review helper extractions first. Do not review formatting first.

Search the diff for removed:

  • @login_required
  • @permission_required
  • Depends(require_user)
  • current_user
  • is_admin
  • has_permission
  • can_read
  • can_write
  • require_role
  • require_scope

Then ask:

  • Does every protected route still prove who the caller is?
  • Does every sensitive action still prove what the caller can do?
  • Did the check move to a shared layer, or did it simply disappear?
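A first pass over the diff for these removed markers can be scripted. This is a minimal sketch, not a parser: it flags checklist markers that appear on removed lines of a unified diff, such as the output of git diff origin/main. The marker list is the one above; extend it for your codebase.

```python
# Markers from the checklist above; extend for your codebase.
AUTH_MARKERS = [
    "@login_required", "@permission_required", "Depends(require_user)",
    "current_user", "is_admin", "has_permission",
    "can_read", "can_write", "require_role", "require_scope",
]

def removed_auth_markers(diff_text: str) -> list[str]:
    """Return auth markers found on removed ('-') lines of a unified diff."""
    hits = []
    for line in diff_text.splitlines():
        # Removed lines start with '-'; '---' is the old-file header, not a removal.
        if line.startswith("-") and not line.startswith("---"):
            hits.extend(m for m in AUTH_MARKERS if m in line)
    return hits
```

A hit is not proof of a regression, because the check may have moved to a shared layer. It is a signal that the reviewer should find where the check went.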

GitHub's Copilot code review documentation is explicit that Copilot is not guaranteed to catch every PR issue and should be supplemented with human review. That is the right mental model for all AI review systems: useful assistant, not final security authority.

2. Did tenant scoping disappear?

Tenant scoping is one of the easiest controls to lose in a refactor because it often looks like "query cleanup".

Look for deleted filters involving:

  • tenant_id
  • org_id
  • organization_id
  • workspace_id
  • account_id
  • team_id
  • owner_id
  • project_id

High-risk patterns:

# Before
query = query.filter(Document.tenant_id == user.tenant_id)

# After
query = query.filter(Document.id == document_id)

// Before
await db.invoice.findFirst({
  where: { id, organizationId: session.organizationId },
});

// After
await db.invoice.findFirst({
  where: { id },
});

If a PR touches a query for customer data, billing records, exports, files, admin actions, or AI context retrieval, tenant scoping should be reviewed as a first-class security control.
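One way to make tenant scoping harder to delete is to centralize it in a single data-access helper, so routes never assemble the filter themselves. A minimal in-memory sketch of the idea follows; the names get_owned and CrossTenantAccess are illustrative, not from any framework.

```python
class CrossTenantAccess(Exception):
    """A record exists but belongs to another tenant."""

def get_owned(records: dict, record_id: str, tenant_id: str):
    """Fetch by id, then prove ownership before returning anything.

    Because every route goes through this helper, a refactor cannot
    drop the tenant check without a visible diff in one place.
    """
    row = records.get(record_id)
    if row is None:
        return None
    if row["tenant_id"] != tenant_id:
        raise CrossTenantAccess(record_id)
    return row
```

In a real codebase the same shape applies to a SQLAlchemy or Prisma query wrapper: the tenant filter lives in one reviewed function instead of being repeated in every endpoint.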

3. Did validation move after the dangerous operation?

AI refactors often move logic into helpers. That is fine until validation ends up running after the value is already used.

Look for:

  • request body schemas removed
  • serializer validation weakened
  • allow_any style configs added
  • regex or allowlist validation deleted
  • URL validation replaced with plain string checks
  • file extension checks replacing content-type checks

Bad pattern:

def import_webhook(payload: dict):
    process_webhook(payload)
    validate_webhook(payload)

The validation exists, but it is too late.
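The fix is mechanical: validate at the boundary, then hand only validated data to the processing step. A sketch with hypothetical validate and process functions:

```python
class InvalidWebhook(ValueError):
    pass

def validate_webhook(payload: dict) -> dict:
    # Hypothetical minimal check: require a string event type.
    if not isinstance(payload.get("event"), str):
        raise InvalidWebhook("missing event type")
    return payload

def process_webhook(payload: dict) -> str:
    return f"handled {payload['event']}"

def import_webhook(payload: dict) -> str:
    # Validation runs first; process_webhook never sees raw input.
    return process_webhook(validate_webhook(payload))
```

In review terms: the diff to look for is not whether validate_webhook exists, but whether it still runs before the first use of the payload.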

4. Did rate limiting disappear?

Rate limits are easy to delete because they often look like non-business logic.

Review this especially on:

  • login
  • password reset
  • signup
  • token refresh
  • export endpoints
  • webhook endpoints
  • expensive search
  • AI chat or agent endpoints

Look for removed:

  • @rate_limit
  • limiter.limit
  • throttle
  • gateway annotations
  • queue caps
  • token or spend caps

For LLM-powered products, rate limits are both a security control and a cost-control boundary.
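An in-process limiter is only a stopgap next to gateway or Redis-backed limits, but a sketch makes the control concrete enough to test. A minimal fixed-window limiter, with all names illustrative:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Minimal fixed-window limiter. Real deployments should enforce this
    at the gateway or in a shared store, not in process memory."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(list)  # key -> recent request timestamps

    def allow(self, key: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Keep only timestamps inside the current window.
        recent = [t for t in self.hits[key] if now - t < self.window]
        self.hits[key] = recent
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

The review question stays the same regardless of implementation: does the login, export, or AI endpoint still pass through a limiter after the refactor?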

5. Did CSRF, CORS, or browser security weaken?

This matters when the PR touches routes, middleware, cookies, sessions, forms, or frontend-backed APIs.

Look for:

  • CSRF middleware removed
  • csrf_exempt added
  • CORS widened from one origin to *
  • cookies changed away from HttpOnly, Secure, or SameSite
  • security headers removed
  • auth moved from server-side cookies to local storage without a threat model

Not every API needs CSRF protection in the same way. But every weakening should be intentional and documented.
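For cookie-based sessions, the expected flags can be pinned down in one reviewed place, so any weakening is a one-line diff. A minimal sketch using Python's standard http.cookies module; the session_cookie helper is illustrative:

```python
from http.cookies import SimpleCookie

def session_cookie(value: str) -> str:
    """Build a session cookie with the browser-security flags the
    checklist expects; a PR that drops any of them should be questioned."""
    cookie = SimpleCookie()
    cookie["session"] = value
    cookie["session"]["httponly"] = True   # not readable from JS
    cookie["session"]["secure"] = True     # HTTPS only
    cookie["session"]["samesite"] = "Lax"  # limits cross-site sends
    return cookie["session"].OutputString()
```

The same pattern applies to CORS config: one shared constant for allowed origins beats per-route strings that can quietly widen to *.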

6. Did audit logging disappear?

Deleted logging can be a security regression even when runtime behavior is unchanged.

Sensitive actions should usually leave an audit trail:

  • admin role changes
  • permission changes
  • billing changes
  • customer data exports
  • API key creation
  • password reset
  • user impersonation
  • data deletion
  • security policy changes

If the PR removes an audit event, the reviewer should ask why.
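One way to keep audit trails reviewable is a decorator on each sensitive action, so removing the trail shows up as a one-line deletion in the diff. A minimal sketch; the decorator name and the in-memory sink are illustrative:

```python
import functools

AUDIT_LOG: list = []  # stand-in for a real append-only audit sink

def audited(action: str):
    """Record who performed which sensitive action before running it."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(actor: str, *args, **kwargs):
            AUDIT_LOG.append({"actor": actor, "action": action})
            return fn(actor, *args, **kwargs)
        return inner
    return wrap

@audited("role_change")
def change_role(actor: str, target: str, role: str) -> str:
    return f"{target} is now {role}"
```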

7. Did the PR add secrets or fake examples?

AI tools often produce realistic examples:

OPENAI_API_KEY = "sk-proj-example"
DATABASE_URL = "postgres://admin:password@prod-db.internal/app"
JWT_SECRET = "change-me"

Even placeholder-looking secrets are a problem because placeholders get copied, reused, and forgotten.
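A rough pre-review pass for secret-shaped strings can be scripted. The patterns below are illustrative and deliberately loose; a sketch:

```python
import re

# Rough patterns for common secret shapes; tune for your stack.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{8,}"),            # OpenAI-style keys
    re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),   # DSNs with embedded credentials
    re.compile(r"(?i)(secret|token|password)\s*=\s*[\"'][^\"']+[\"']"),
]

def find_secret_like(text: str) -> list:
    """Return substrings that look like secrets, placeholder or not."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits
```

Regexes like these produce false positives by design; the point is to force a human decision on anything secret-shaped, placeholder or real.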

For local scanning, Skylos can run danger and quality checks before the PR gets reviewed:

pipx install skylos
skylos . --danger --quality

The OSS repo is here: github.com/duriantaco/skylos.

8. Did the AI hallucinate imports, packages, or helpers?

This is common in Python and JavaScript/TypeScript.

Look for:

  • packages not in requirements.txt, pyproject.toml, package.json, or lockfiles
  • helper functions with plausible names but no definition
  • library methods that do not exist in your installed version
  • code comments saying "assume this exists"

This is both a reliability problem and a supply-chain problem. A hallucinated import can become a malicious dependency risk if someone installs a similarly named package just to make the code run.
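For Python, a cheap hallucination tripwire can be built from the standard library alone: parse the changed file and check whether each imported top-level module resolves in the current environment. A minimal sketch:

```python
import ast
import importlib.util

def unresolvable_imports(source: str) -> list:
    """Top-level modules imported by `source` that cannot be found
    in the current environment."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue  # skip relative imports and non-import nodes
        for name in names:
            root = name.split(".")[0]
            if importlib.util.find_spec(root) is None:
                missing.append(root)
    return missing
```

This only proves the module resolves somewhere, not that it is the package you intended, so it complements rather than replaces a lockfile check.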

For a deeper workflow, see How to Verify AI-Generated Python Code and Catch Hallucinated Imports.

9. Did tests cover the security-negative path?

Happy-path tests are not enough for AI-assisted PRs.

For a protected route, require tests for:

  • unauthenticated request
  • authenticated but unauthorized request
  • cross-tenant request
  • invalid input
  • missing required scope
  • rate-limit behavior where relevant

The test that matters for the earlier tenant example is not:

def test_get_customer_returns_200():
    ...

It is:

def test_get_customer_rejects_cross_tenant_access():
    ...

10. Is there a CI gate, or is this all reviewer memory?

Manual review is a weak control if the same issue appears every week.

Use the manual checklist to learn the patterns. Then move the high-confidence checks into automation.

For Skylos:

skylos . --danger --quality
skylos defend .
skylos debt .

For pull requests, use diff-aware scanning where the reviewer needs to know what changed:

skylos . --danger --diff origin/main

For a full local bundle:

skylos suite .

To upload suite results to Skylos Cloud as separate scan families:

skylos suite . --upload

If you want the GitHub Actions workflow, see Python Security Scanner for GitHub Actions.


Where static analysis fits

Static analysis is still necessary. The question is what job each tool is doing.

CodeQL is strong for semantic analysis and GitHub code scanning. Semgrep taint analysis is useful for tracking untrusted data from sources to sinks. Snyk, SonarQube, and other AppSec tools cover broader platform workflows.

But AI code review security adds another question:

Did this PR remove a security control that used to exist?

That is why diff-aware scanning matters. A whole-repo scan can tell you what exists now. A diff-aware review can tell you what disappeared.

Skylos is designed to sit beside those tools as a local-first PR guardrail, especially for:

  • removed auth checks
  • tenant-scoping regressions
  • dead code and cleanup risk
  • AI-generated code verification
  • LLM app defense checks
  • technical debt hotspots
  • CI and optional cloud upload workflows

For a broader tool comparison, read Best AI Code Security Tools in 2026 Compared.


The minimum review policy for AI-assisted PRs

If you want a simple policy, use this:

  1. Any PR touching auth, billing, admin, exports, tenants, files, webhooks, or AI endpoints gets security review.
  2. The reviewer checks deleted controls before reviewing new helper code.
  3. Cross-tenant negative tests are required for tenant-scoped data.
  4. High-confidence static findings block merge.
  5. Diff-aware regression findings block merge.
  6. Reviewer comments that repeat twice become CI rules.

That last point matters. The goal is not to turn senior engineers into permanent AI babysitters. The goal is to convert repeated review pain into automated gates.


A Dev.to-friendly summary

If you cross-post this to Dev.to, use the canonical URL back to the Skylos site and lead with the concrete failure mode:

AI code review is not just "find the new bug." The scarier failure mode is a clean refactor that removes an auth check, tenant filter, validation branch, rate limit, or audit log while tests still pass.

Suggested Dev.to tags:

  • ai
  • security
  • opensource
  • codequality

DEV supports canonical URLs for cross-posting, so set the canonical URL to:

https://skylos.dev/blog/ai-code-review-security-pr-checklist

Final takeaway

AI code review security is not about distrusting every AI-generated line. It is about recognizing that AI-assisted PRs create a new review shape.

The risky change may be the thing that disappeared.

Start with the manual checklist:

  • auth still exists
  • tenant scoping still exists
  • validation still runs before use
  • rate limits still apply
  • CSRF, CORS, cookies, and headers did not weaken
  • audit logging still records sensitive actions
  • secrets were not added
  • imports and helpers are real
  • negative tests cover the security path

Then enforce the repeatable parts with local scanning and CI.

Skylos is open source and local-first. Try it here: github.com/duriantaco/skylos.

pipx install skylos
skylos . --danger --quality
skylos . --danger --diff origin/main

If your team wants one command for static analysis, AI defense, technical debt, and optional cloud upload:

skylos suite .
skylos suite . --upload

Sources