Commit

Finished the advanced docs section
anarthal committed Feb 11, 2024
1 parent 9e5862b commit 7c07d7b
Showing 2 changed files with 77 additions and 42 deletions.
15 changes: 10 additions & 5 deletions doc/qbk/22_sql_formatting.qbk
@@ -122,6 +122,8 @@ on format contexts to add raw SQL and formatted values without a format string.





[heading:unit_test Unit testing]

If you are composing very complex queries, it's advisable to unit test them.
Expand Down Expand Up @@ -173,11 +175,14 @@ passed character set. This triggers a `client_errc::invalid_encoding` error:
[sql_formatting_invalid_encoding]

You can validate your strings beforehand, or handle the error once
it has happened and reject the input. Other types may also produce format errors.

[tip
If you prefer handling errors with error codes, instead of exceptions,
use [reflink format_sql_to]. Please read
[link mysql.sql_formatting_advanced.error_handling this section] for details.
]
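
As an illustration of validating strings beforehand, the following is a minimal sketch
of a strict UTF-8 check. It assumes that the connection uses `utf8mb4`, and
`sketch_is_valid_utf8` is a hypothetical helper that is not part of the library.
The library's own validation may differ in details:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not a library function): returns true if s is
// well-formed UTF-8. Assumes the connection's character set is utf8mb4.
inline bool sketch_is_valid_utf8(std::string_view s)
{
    // Smallest code point that legitimately needs a sequence of each length
    static constexpr std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < s.size())
    {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80)
        {
            ++i;  // single-byte (ASCII) character
            continue;
        }

        // Determine the sequence length from the lead byte
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (s.size() - i < len) return false;  // truncated sequence

        // Every following byte must be a continuation byte (10xxxxxx)
        for (std::size_t k = 1; k < len; ++k)
        {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong encodings, surrogates and out-of-range code points
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += len;
    }
    return true;
}

// Usage: reject the input before it reaches any formatting function
inline void demo_validation()
{
    assert(sketch_is_valid_utf8("John"));
    assert(!sketch_is_valid_utf8("bad\xff"));
}
```

A check like this lets the application reject the input early, before any SQL is composed.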

Other types may also produce format errors. Please read
[link mysql.sql_formatting_advanced.error_handling this section] for an in-depth
explanation on SQL formatting error handling.



Expand Down Expand Up @@ -244,7 +249,7 @@ Both client-side SQL formatting and prepared statements have pros and cons effic
]
[
[`std::vector<unsigned char, Allocator>` (including [reflink blob]), [reflink blob_view]]
[Single-quoted, escaped string literal]
[Hex string literal]
[
[sql_formatting_reference_blob]
]
104 changes: 67 additions & 37 deletions doc/qbk/23_sql_formatting_advanced.qbk
@@ -91,67 +91,97 @@ Otherwise, an error will be generated.

[heading:error_handling Error handling model]

* There are unformattable values. Floats and doubles can be inf and NaN, and
that's something MySQL doesn't have a representation for. Strings can only
be escaped securely if they are encoded according to the current character set.
Attempting to escape an invalid string can lead to vulnerabilities.
* In general, SQL formatting accepts values even if they are invalid
(e.g. out-of-range dates) as long as we're sure MySQL will handle them securely.
Otherwise, we reject.
* Format contexts contain an "error state". This is an error code that can be
set by formatting operations. Once set, further formatting operations can be issued,
but the overall operation fails (monad-like). Rationale: avoid checking codes at every
individual op.
* Set by built-in formatters as described, or by custom formatters by
set_error.
* [refmem basic_format_context get] returns a
[link mysql.error_handling.system_result `boost::system::result`].
If the error state was set at any point, this contains an error. Otherwise, a
value.
* [reflink format_sql_to] may also encounter errors
while processing the format string - e.g. if you provide fewer arguments than required.
This also sets the error state.
* [reflink format_sql] reports errors via `boost::system::system_error` exceptions.
It creates the context, formats, then checks - as if `.value()` was called.
Some values can't be securely formatted. For instance, a C++
`double` can be NaN or infinity, neither of which is supported by MySQL.
Strings can contain byte sequences that don't represent valid characters,
which makes them impossible to escape securely.

[reflink format_sql] reports errors by throwing `boost::system::system_error` exceptions,
which contain an error code with details about what happened. For instance:

```
// We're trying to format a double infinity value, which is not
// supported by MySQL. This will throw a system_error exception
// with a client_errc::unformattable_value error code
format_sql("SELECT {}", opts, HUGE_VAL);
```

You don't have to use exceptions, though. [reflink basic_format_context] and
[reflink format_sql_to] use [link mysql.error_handling.system_result `boost::system::result`]
instead.

[reflink basic_format_context] contains an error code that is set when formatting
a value fails. This is called the ['error state], and can be queried using [refmem basic_format_context error_state].
When [refmem basic_format_context get] is called (after all individual values have been formatted),
the error state is checked. The `system::result` returned by `get` will contain the error
state if it was set, or the generated query if it was not:

```
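format_context ctx (opts);
// Pass the context (not the options), so that the output and any error end up in ctx.
// Formatting HUGE_VAL (infinity) sets the error state; formatting 42 still proceeds
format_sql_to("SELECT {}, {}", ctx, HUGE_VAL, 42);
boost::system::result<std::string> res = std::move(ctx).get();
ASSERT(!res.has_value());
ASSERT(res.has_error());
ASSERT(res.error() == client_errc::unformattable_value);
// res.value() would throw a system_error exception, like format_sql would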
```

Rationale: the error state mechanism makes composing formatters easier,
as the error state is checked only once.

Errors caused by invalid format strings are also reported using this mechanism.
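
To illustrate the idea, the following is a minimal, self-contained sketch of a
context with a sticky error state. `sketch_context` and `sketch_errc` are
hypothetical names, and this is not the library's implementation. It only shows
why individual operations don't need to be checked one by one. Keeping the first
error when several operations fail is a choice made for this sketch:

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical error codes, standing in for client_errc values
enum class sketch_errc { unformattable_value };

// Hypothetical miniature format context (NOT the library's implementation)
class sketch_context
{
    std::string output_;
    std::optional<sketch_errc> error_state_;

public:
    // Appends raw SQL verbatim
    sketch_context& append_raw(std::string_view sql)
    {
        output_.append(sql);
        return *this;
    }

    // Formats a double. NaN and infinity set the error state instead of throwing,
    // so later operations can still be issued
    sketch_context& append_value(double v)
    {
        if (std::isfinite(v))
            output_.append(std::to_string(v));
        else if (!error_state_)
            error_state_ = sketch_errc::unformattable_value;  // keep the first error
        return *this;
    }

    std::optional<sketch_errc> error_state() const { return error_state_; }

    // The single check: yields the query only if no operation failed
    std::optional<std::string> get() &&
    {
        if (error_state_) return std::nullopt;
        return std::move(output_);
    }
};

// Usage: operations are chained without checking each one
inline void demo_error_state()
{
    sketch_context ctx;
    ctx.append_raw("SELECT ")
        .append_value(std::numeric_limits<double>::infinity())
        .append_raw(", ")
        .append_value(42.0);
    assert(ctx.error_state() == sketch_errc::unformattable_value);
    assert(!std::move(ctx).get().has_value());
}
```

Because every operation returns normally, a formatter composed of many smaller
formatters only needs to be checked at the point where the final string is retrieved.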



[heading:format_options Format options and character set tracking]


[*Background]: MySQL has many configuration options that affect its syntax. There are two options that affect how
string values are escaped and formatted:
[heading:format_options Format options and character set tracking]

MySQL has many configuration options that affect its syntax. There are two options
that formatting functions need to know in order to work:

* Whether the backslash character represents an escape sequence or not. By default it does,
but this can be disabled dynamically by setting the
[@https://dev.mysql.com/doc/refman/8.0/en/sql-mode.html#sqlmode_no_backslash_escapes NO_BACKSLASH_ESCAPES] SQL mode.
This is tracked by [reflink any_connection] automatically.
* The connection's [*current character set]. This is tracked by connections as far as possible, but deficiencies
in the protocol create cases where the character set may not be known to the client. This is why
[refmem any_connection format_opts] returns a `boost::system::result<format_options>` that will
contain an error in case the character set is unknown.

This is tracked by [reflink any_connection] automatically (see [refmem any_connection backslash_escapes]).
* The connection's [*current character set]. This determines which multi-byte sequences are valid,
and is required to iterate the string, validate it and escape it. The current character set is tracked
by connections as far as possible, but deficiencies in the protocol create cases where the character
set may not be known to the client. The current character set can be accessed using
[refmem any_connection current_character_set].

[refmem any_connection format_opts] is a convenience function that returns a
[link mysql.error_handling.system_result `boost::system::result`]`<`[reflink format_options]`>`.
If the connection could not determine the current character set, the result will contain an error.
For a reference on how character set tracking works, please read [link mysql.charsets.tracking this section].

[warning
Passing an incorrect `format_options` value to formatting functions may cause
escaping to generate incorrect values, which may lead to vulnerabilities.
Stay safe and always use [refmem any_connection format_opts] instead of
hand-crafting `format_options` values.
]
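
To see why the backslash setting matters, consider the following simplified sketch.
`sketch_quote` is a hypothetical helper, not a library function. It handles only
quotes and backslashes in ASCII input, and deliberately skips the character set
handling that a real implementation requires:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (NOT the library's escaping routine). ASCII-only:
// it ignores the current character set and other special characters.
inline std::string sketch_quote(std::string_view input, bool backslash_escapes)
{
    std::string out = "'";
    for (char c : input)
    {
        if (c == '\'')
        {
            // With backslash escapes, a quote is escaped as \'
            // With NO_BACKSLASH_ESCAPES, the quote is doubled instead
            out += backslash_escapes ? "\\'" : "''";
        }
        else if (c == '\\' && backslash_escapes)
        {
            out += "\\\\";  // the backslash itself must be escaped
        }
        else
        {
            out += c;  // under NO_BACKSLASH_ESCAPES, a backslash is an ordinary character
        }
    }
    out += '\'';
    return out;
}

// Usage: the same input yields different literals depending on the option
inline void demo_quote()
{
    assert(sketch_quote("O'Neil\\x", true) == "'O\\'Neil\\\\x'");
    assert(sketch_quote("O'Neil\\x", false) == "'O''Neil\\x'");
}
```

If the wrong setting were assumed, the first output would be read incorrectly by a server
running with `NO_BACKSLASH_ESCAPES`, which is the kind of mismatch the warning above describes.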




[heading:custom_strings Custom string types]
[heading Custom string types]

[reflink format_sql_to] can be used with string types that are not `std::string`.
Anything satisfying the [reflink OutputString] concept works. This includes
strings with custom allocators (like `std::pmr::string`) or `boost::static_string`.
You need to use [reflink basic_format_context], specifying the string type:

[sql_formatting_custom_string]

You need to create a [reflink basic_format_context], specifying the string type:

[sql_formatting_custom_string]

You can also provide a string value, to re-use memory:

[sql_formatting_memory_reuse]

[heading Re-using string memory]

You can pass a string value to the context's constructor, to re-use memory:

[sql_formatting_memory_reuse]


