parser: two-phase parsing
The parser now returns immediately after the header is parsed and does not
begin parsing the body until the next call to `parse()`. For bodiless messages
and HEAD responses, it transitions directly to the `complete_in_place` state
once the header is parsed, making a further call to `parse()` unnecessary (but
still valid).

This two-phase parsing brings a few benefits with almost no complications on the
usage side of the API:

- It introduces an optimization opportunity for users who want to attach a body.
  If they do so immediately after the header is parsed (which seems to be the
  most common case), `cb1_` is not needed at all for elastic bodies, and only a
  small `cb1_` is needed for sink bodies (as it is used only temporarily). All
  the extra space can then be given to `cb0_`.
- Because parsing the body might complete with an error, returning after the
  header is parsed lets users access the header first and encounter the error
  only on the next call to `parse()`.
- Setting the body limit in the middle of parsing the body or after it doesn't
  make much sense, so returning right after the header is parsed provides a
  window for setting such limits.
- If users want to attach a body, they will almost always do so immediately
  after the header is parsed. By not continuing the parsing of the body, we
  avoid the need for an extra buffer copy operation (in case the user wants to
  attach a buffer).
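
The calling pattern described above can be illustrated with a self-contained
toy model. This is only a sketch and not the Boost.Http.Proto implementation:
`toy_parser`, `feed()`, the `error` enum, and the "body is whatever input
remains" rule are hypothetical simplifications. Only the names `parse()`,
`set_body_limit()`, and `header_done` echo this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical toy model of two-phase parsing; NOT the Boost.Http.Proto parser.
class toy_parser
{
public:
    enum class state { header, header_done, body, complete };
    enum class error { none, need_more, body_too_large };

    explicit toy_parser(std::uint64_t default_limit)
        : limit_(default_limit)
    {
    }

    // Append raw input (hypothetical helper).
    void feed(std::string_view s) { in_.append(s); }

    // Only valid in the window between the two phases.
    void set_body_limit(std::uint64_t n)
    {
        assert(st_ == state::header_done);
        limit_ = n;
    }

    error parse()
    {
        if(st_ == state::header)
        {
            auto const pos = in_.find("\r\n\r\n");
            if(pos == std::string::npos)
                return error::need_more;
            header_ = in_.substr(0, pos + 4);
            in_.erase(0, pos + 4);
            st_ = state::header_done; // phase 1: stop before the body
            return error::none;
        }
        if(st_ == state::header_done)
            st_ = state::body; // phase 2 starts on this call
        if(st_ == state::body)
        {
            // Toy assumption: the body is whatever input remains.
            if(in_.size() > limit_)
                return error::body_too_large;
            body_ = in_;
            in_.clear();
            st_ = state::complete;
        }
        return error::none;
    }

    state current() const noexcept { return st_; }
    std::string const& header() const noexcept { return header_; }
    std::string const& body() const noexcept { return body_; }

private:
    state st_ = state::header;
    std::uint64_t limit_;
    std::string in_, header_, body_;
};
```

In this model the first `parse()` stops at `header_done`, so a caller can
inspect the header and adjust the limit before any body bytes are consumed. A
body error such as exceeding the limit surfaces only on the second `parse()`,
while the header remains accessible.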
ashtum committed Jan 12, 2025
1 parent cfc59eb commit c8199d8
Showing 4 changed files with 421 additions and 330 deletions.
22 changes: 18 additions & 4 deletions include/boost/http_proto/parser.hpp
@@ -323,6 +323,18 @@ class BOOST_SYMBOL_VISIBLE
Sink&
set_body(Args&&... args);

/** Sets the maximum allowed size of the body for the current message.
This overrides the default value specified by
@ref config_base::body_limit.
The limit automatically resets to the default
for the next message.
@param n The new body size limit in bytes.
*/
void
set_body_limit(std::uint64_t n);

/** Return the available body data.
The returned buffer span will be invalidated if any member
@@ -369,9 +381,6 @@ class BOOST_SYMBOL_VISIBLE
bool
is_plain() const noexcept;

void
on_headers(system::error_code&);

BOOST_HTTP_PROTO_DECL
void
on_set_body() noexcept;
@@ -382,13 +391,17 @@ class BOOST_SYMBOL_VISIBLE
std::size_t,
bool);

std::uint64_t
body_limit_remain() const noexcept;

static constexpr unsigned buffers_N = 8;

enum class state
{
reset,
start,
header,
header_done,
body,
set_body,
complete_in_place,
@@ -407,10 +420,11 @@ class BOOST_SYMBOL_VISIBLE

detail::workspace ws_;
detail::header h_;
std::size_t body_avail_ = 0;
std::uint64_t body_limit_= 0;
std::uint64_t body_total_ = 0;
std::uint64_t payload_remain_ = 0;
std::uint64_t chunk_remain_ = 0;
std::size_t body_avail_ = 0;
std::size_t nprepare_ = 0;

// used to store initial headers + any potential overread