Refactored binlog event data structures / parsers (#25)
Redesigned data structures for storing binlog event data components.
Binlog events now have the following parts (as per MySQL design docs):
* 'common_header' (of fixed length, for binlog V4 19 bytes)
* 'post_header' (of various length depending on event code)
* 'body' (of various length depending on event code)
* optional 'footer' (currently holding CRC32 checksum, 4 bytes, of the
event data)
The footer is always present for 'format description' events. For the
rest of the events, its presence depends on the 'checksum_algorithm'
field from the last seen 'format description' event.

All binlog event-related data structures moved into new 'binsrv::event'
namespace.
'binsrv::event_type' became 'binsrv::event::code_type'.
'binsrv::event_flag' became 'binsrv::event::flag_type'.
'binsrv::event_flag_set' became 'binsrv::event::flag_set'.
'binsrv::event_header' became 'binsrv::event::common_header'.
Introduced 'binsrv::event::checksum_algorithm_type' enum (with 2 values
'off' and 'crc32').
Introduced 'binsrv::event::footer' class for holding binlog event footer
data (currently only 'checksum algorithm').
Introduced 'binsrv::event::generic_post_header_impl' class template that
is supposed to be specialized for each known event code with post header
data specific to this particular event.
Introduced 'binsrv::event::generic_body_impl' class template that is
supposed to be specialized for each known event code with body data
specific to this particular event.
For unknown (valid but not meaningful in this project) post headers /
bodies introduced 'binsrv::event::unknown_post_header' and
'binsrv::event::unknown_body' classes which are used as default
specialization.
For empty (containing no data) post headers / bodies introduced
'binsrv::event::empty_post_header' and 'binsrv::event::empty_body'.
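
One possible shape for this "specialize per event code, fall back to unknown" pattern is sketched below. It is an assumption about the layout, not the code from this commit: the trait-style `::type` member, the field in `rotate_post_header` and the reduced `code_type` enum are all illustrative. Only the alias name `generic_post_header` is taken from its use in 'src/app.cpp'.

```cpp
#include <cstdint>
#include <type_traits>

// Reduced illustrative enum; numeric values follow MySQL's event codes.
enum class code_type : std::uint8_t {
  unknown = 0,
  rotate = 4,
  format_description = 15
};

// Illustrative payload types.
struct unknown_post_header {};
struct rotate_post_header {
  std::uint64_t position{};
};

// Primary template: every code without a dedicated specialization
// resolves to the 'unknown' post header.
template <code_type Code> struct generic_post_header_impl {
  using type = unknown_post_header;
};

// Specialization for a known event code.
template <> struct generic_post_header_impl<code_type::rotate> {
  using type = rotate_post_header;
};

// Convenience alias so that callers can write generic_post_header<Code>.
template <code_type Code>
using generic_post_header = typename generic_post_header_impl<Code>::type;

static_assert(std::is_same_v<generic_post_header<code_type::rotate>,
                             rotate_post_header>);
static_assert(std::is_same_v<generic_post_header<code_type::unknown>,
                             unknown_post_header>);
```

Because the mapping is resolved at compile time, no virtual dispatch is involved. In this sketch 'format_description' has no specialization, so it also resolves to 'unknown_post_header'.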

Added specializations for post headers / bodies for the following
events:
* format description event
* rotate event

Introduced an aggregate class for storing all binlog event data
components - 'binsrv::event::event'. It was designed to eliminate (with
a few minor exceptions) dynamic memory allocations and virtual function
calls for binlog event data manipulation.
It includes:
* binsrv::event::common_header
* a smart union ('std::variant') of all possible event post headers
  (specializations of 'binsrv::event::generic_post_header_impl' )
* a smart union ('std::variant') of all possible event bodies
  (specializations of 'binsrv::event::generic_body_impl' )
* optional (via 'std::optional') binsrv::event::footer
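
The aggregate described above might look roughly like the following. This is a simplified sketch under stated assumptions: every type, member and function name here is illustrative (the real class is 'binsrv::event::event'), and only two alternatives per variant are shown. The `std::string` in `rotate_body` is the kind of "minor exception" that still allocates dynamically.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// All names below are illustrative, not the project's API.
struct common_header {
  std::uint32_t timestamp{};
  std::uint8_t type_code{};
};
struct unknown_post_header {};
struct rotate_post_header {
  std::uint64_t position{};
};
struct unknown_body {};
struct rotate_body {
  std::string binlog_name;
};
struct footer {
  std::uint32_t crc{};
};

// Value-type aggregate: no base classes, no virtual functions.
struct event {
  common_header header;
  std::variant<unknown_post_header, rotate_post_header> post_header;
  std::variant<unknown_body, rotate_body> body;
  std::optional<footer> checksum;
};

// Dispatches on the held post header type without virtual calls.
inline std::string describe(const event &e) {
  return std::visit(
      [](const auto &post_header) -> std::string {
        using held_type = std::decay_t<decltype(post_header)>;
        if constexpr (std::is_same_v<held_type, rotate_post_header>) {
          return "rotate at " + std::to_string(post_header.position);
        } else {
          return "unknown";
        }
      },
      e.post_header);
}

// Direct access when the caller already knows the event code.
inline std::uint64_t rotate_position(const event &e) {
  assert(std::holds_alternative<rotate_post_header>(e.post_header));
  return std::get<rotate_post_header>(e.post_header).position;
}
```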

Reworked event reading loop in the main application: last seen format
description event is now stored outside of the loop to make decisions
about checksum presence in the footers of the events that follow.

For the 'trace' log level, original event bytes are also printed in
raw (non-hex) form. Non-printable characters are shown as '.'.

Fixed incorrect help message describing expected command line
arguments.

'byte_span' moved from 'easymysql' namespace to 'util' and became a
writable range. Underlying type changed from 'unsigned char' to
'std::byte'. For read-only byte ranges, introduced 'const_byte_span'.
Added 'util::as_string_view()' helper function for these two types.

'extract_fixed_int_from_byte_span()' and
'extract_byte_array_from_byte_span()' static helper functions moved to
the 'util' namespace and became globally available.

Introduced two new utility functions, 'util::enum_to_index()' and
'util::index_to_enum()', that simplify the conversion from enums to
'std::size_t' (via the underlying type) and back.

Improved the behavior of the 'to_string()' function for 'util::flag_set'
class template. Unknown bits in the set are now ignored instead of
being printed as empty strings.
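
The "ignore unknown bits" behavior can be illustrated with a stand-alone function. This is not 'util::flag_set' itself: the `flag_label` struct, the `flags_to_string()` name, the two sample labels and the " | " separator are all assumptions made for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical table entry: a single-bit mask and its label.
struct flag_label {
  std::uint16_t mask;
  std::string_view label;
};

// Joins the labels of all known bits set in 'bits'.
// Bits without a table entry contribute nothing to the output.
inline std::string flags_to_string(std::uint16_t bits) {
  static constexpr std::array known{flag_label{0x0001U, "binlog_in_use"},
                                    flag_label{0x0002U, "forced_rotate"}};
  std::string result;
  for (const auto &[mask, label] : known) {
    assert(!label.empty()); // known flags must have a printable label
    if ((bits & mask) == 0) {
      continue;
    }
    if (!result.empty()) {
      result += " | ";
    }
    result += label;
  }
  return result;
}
```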

More integral constants used in 'std::size_t' contexts now have an
explicit 'U' integer suffix. This should be changed to `UZ` once the
project switches to 'c++23'.

More 'type var = value;' initializations converted to modern
'type var{value};'.
percona-ysorokin authored Dec 9, 2023
1 parent dd1fb45 commit 19bafd4
Showing 69 changed files with 1,886 additions and 441 deletions.
78 changes: 68 additions & 10 deletions CMakeLists.txt
@@ -49,8 +49,68 @@ find_package(Boost 1.83.0 EXACT REQUIRED)
find_package(MySQL REQUIRED)

set(source_files
# main application files
src/app.cpp

# binlog event data structure files
src/binsrv/event/checksum_algorithm_type_fwd.hpp
src/binsrv/event/checksum_algorithm_type.hpp

src/binsrv/event/code_type_fwd.hpp
src/binsrv/event/code_type.hpp

src/binsrv/event/common_header_fwd.hpp
src/binsrv/event/common_header.hpp
src/binsrv/event/common_header.cpp

src/binsrv/event/empty_body_fwd.hpp
src/binsrv/event/empty_body.hpp
src/binsrv/event/empty_body.cpp

src/binsrv/event/empty_post_header_fwd.hpp
src/binsrv/event/empty_post_header.hpp
src/binsrv/event/empty_post_header.cpp

src/binsrv/event/event_fwd.hpp
src/binsrv/event/event.hpp
src/binsrv/event/event.cpp

src/binsrv/event/flag_type_fwd.hpp
src/binsrv/event/flag_type.hpp

src/binsrv/event/footer_fwd.hpp
src/binsrv/event/footer.hpp
src/binsrv/event/footer.cpp

src/binsrv/event/format_description_body_impl_fwd.hpp
src/binsrv/event/format_description_body_impl.hpp
src/binsrv/event/format_description_body_impl.cpp

src/binsrv/event/format_description_post_header_impl_fwd.hpp
src/binsrv/event/format_description_post_header_impl.hpp
src/binsrv/event/format_description_post_header_impl.cpp

src/binsrv/event/generic_body_fwd.hpp
src/binsrv/event/generic_body.hpp

src/binsrv/event/generic_post_header_fwd.hpp
src/binsrv/event/generic_post_header.hpp

src/binsrv/event/rotate_body_impl_fwd.hpp
src/binsrv/event/rotate_body_impl.hpp
src/binsrv/event/rotate_body_impl.cpp

src/binsrv/event/rotate_post_header_impl_fwd.hpp
src/binsrv/event/rotate_post_header_impl.hpp
src/binsrv/event/rotate_post_header_impl.cpp

src/binsrv/event/unknown_body_fwd.hpp
src/binsrv/event/unknown_body.hpp

src/binsrv/event/unknown_post_header_fwd.hpp
src/binsrv/event/unknown_post_header.hpp

# binlog files
src/binsrv/basic_logger_fwd.hpp
src/binsrv/basic_logger.hpp
src/binsrv/basic_logger.cpp
@@ -61,16 +121,6 @@ set(source_files
src/binsrv/exception_handling_helpers.hpp
src/binsrv/exception_handling_helpers.cpp

src/binsrv/event_flag_fwd.hpp
src/binsrv/event_flag.hpp

src/binsrv/event_header_fwd.hpp
src/binsrv/event_header.hpp
src/binsrv/event_header.cpp

src/binsrv/event_type_fwd.hpp
src/binsrv/event_type.hpp

src/binsrv/file_logger.hpp
src/binsrv/file_logger.cpp

@@ -87,6 +137,11 @@ set(source_files
src/binsrv/master_config.hpp
src/binsrv/master_config.cpp

# various utility files
src/util/byte_span_fwd.hpp
src/util/byte_span.hpp
src/util/byte_span_extractors.hpp

src/util/command_line_helpers_fwd.hpp
src/util/command_line_helpers.hpp
src/util/command_line_helpers.cpp
@@ -112,6 +167,9 @@ set(source_files
src/util/nv_tuple_from_command_line.hpp
src/util/nv_tuple_from_json.hpp

src/util/redirectable.hpp

# mysql wrapper library files
src/easymysql/core_error_helpers_private.hpp
src/easymysql/core_error_helpers_private.cpp
src/easymysql/core_error.hpp
131 changes: 104 additions & 27 deletions src/app.cpp
@@ -5,56 +5,103 @@
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <locale>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <string_view>

#include "binsrv/basic_logger.hpp"
#include "binsrv/event_header.hpp"
#include "binsrv/exception_handling_helpers.hpp"
#include "binsrv/log_severity.hpp"
#include "binsrv/logger_factory.hpp"
#include "binsrv/master_config.hpp"

#include "binsrv/event/code_type.hpp"
#include "binsrv/event/event.hpp"

#include "easymysql/binlog.hpp"
#include "easymysql/connection.hpp"
#include "easymysql/connection_config.hpp"
#include "easymysql/library.hpp"

#include "util/byte_span_fwd.hpp"
#include "util/command_line_helpers.hpp"
#include "util/exception_location_helpers.hpp"
#include "util/nv_tuple.hpp"

void log_span_dump(binsrv::basic_logger &logger,
easymysql::binlog_stream_span portion) {
static constexpr std::size_t bytes_per_dump_line = 16;
std::size_t offset = 0;
util::const_byte_span portion) {
static constexpr std::size_t bytes_per_dump_line{16U};
std::size_t offset{0U};
while (offset < std::size(portion)) {
std::ostringstream oss;
oss << '[';
oss << std::setfill('0') << std::hex;
auto sub = portion.subspan(
offset, std::min(bytes_per_dump_line, std::size(portion) - offset));
for (const std::uint8_t current_byte : sub) {
oss << ' ' << std::setw(2) << static_cast<std::uint16_t>(current_byte);
for (auto current_byte : sub) {
oss << ' ' << std::setw(2)
<< std::to_integer<std::uint16_t>(current_byte);
}
const std::size_t filler_length =
(bytes_per_dump_line - std::size(sub)) * 3U;
oss << std::setfill(' ') << std::setw(static_cast<int>(filler_length))
<< "";
oss << " ] ";
const auto &ctype_facet{
std::use_facet<std::ctype<char>>(std::locale::classic())};

for (auto current_byte : sub) {
auto current_char{std::to_integer<char>(current_byte)};
if (!ctype_facet.is(std::ctype_base::print, current_char)) {
current_char = '.';
}
oss.put(current_char);
}
oss << " ]";
logger.log(binsrv::log_severity::trace, oss.str());
offset += bytes_per_dump_line;
}
}

void log_generic_event(binsrv::basic_logger &logger,
const binsrv::event_header &generic_event) {
void log_event_common_header(
binsrv::basic_logger &logger,
const binsrv::event::common_header &common_header) {
std::ostringstream oss;
oss << "ts: " << generic_event.get_readable_timestamp()
<< ", type:" << generic_event.get_readable_type_code()
<< ", server_id:" << generic_event.get_server_id()
<< ", event size:" << generic_event.get_event_size()
<< ", next event position:" << generic_event.get_next_event_position()
<< ", flags: (" << generic_event.get_readable_flags() << ')';
oss << "ts: " << common_header.get_readable_timestamp()
<< ", type:" << common_header.get_readable_type_code()
<< ", server id:" << common_header.get_server_id_raw()
<< ", event size:" << common_header.get_event_size_raw()
<< ", next event position:" << common_header.get_next_event_position_raw()
<< ", flags: (" << common_header.get_readable_flags() << ')';

logger.log(binsrv::log_severity::debug, oss.str());
}

void log_format_description_event(
binsrv::basic_logger &logger,
const binsrv::event::generic_post_header<
binsrv::event::code_type::format_description> &post_header,
const binsrv::event::generic_body<
binsrv::event::code_type::format_description> &body) {
std::ostringstream oss;
oss << '\n'
<< " binlog version : " << post_header.get_binlog_version_raw()
<< '\n'
<< " server version : " << post_header.get_server_version() << '\n'
<< " create timestamp : " << post_header.get_readable_create_timestamp()
<< '\n'
<< " header length : " << post_header.get_common_header_length()
<< '\n'
<< " checksum algorithm: " << body.get_readable_checksum_algorithm()
<< '\n'
<< " post-header length for ROTATE: "
<< post_header.get_post_header_length(binsrv::event::code_type::rotate)
<< '\n'
<< " post-header length for FDE : "
<< post_header.get_post_header_length(
binsrv::event::code_type::format_description);

logger.log(binsrv::log_severity::debug, oss.str());
}
@@ -70,9 +117,10 @@ int main(int argc, char *argv[]) {

if (number_of_cmd_args != binsrv::master_config::flattened_size + 1 &&
number_of_cmd_args != 2) {
std::cerr << "usage: " << executable_name
<< " <host> <port> <user> <password>\n"
<< " " << executable_name << " <json_config_file>\n";
std::cerr
<< "usage: " << executable_name
<< " <logger_level> <logger_file> <host> <port> <user> <password>\n"
<< " " << executable_name << " <json_config_file>\n";
return exit_code;
}
binsrv::basic_logger_ptr logger;
@@ -92,7 +140,7 @@ int main(int argc, char *argv[]) {
util::get_readable_command_line_arguments(cmd_args));

binsrv::master_config_ptr config;
if (number_of_cmd_args == 2) {
if (number_of_cmd_args == 2U) {
logger->log(binsrv::log_severity::delimiter,
"Reading connection configuration from the JSON file.");
config = std::make_shared<binsrv::master_config>(cmd_args[1]);
@@ -155,16 +203,28 @@ int main(int argc, char *argv[]) {
msg += connection.get_character_set_name();
logger->log(binsrv::log_severity::info, msg);

static constexpr std::uint32_t default_server_id = 0;
static constexpr std::uint32_t default_server_id{0U};
auto binlog = connection.create_binlog(default_server_id);
logger->log(binsrv::log_severity::info, "opened binary log connection");

easymysql::binlog_stream_span portion;
// TODO: make sure we write 'Binlog File Header' ([ 0xFE 'bin' ]) to the
// beginning of the binlog file
// TODO: The first event is either a START_EVENT_V3 or a
// FORMAT_DESCRIPTION_EVENT while the last event is either a
// STOP_EVENT or ROTATE_EVENT. For Binlog Version 4 (current one)
// only FORMAT_DESCRIPTION_EVENT / ROTATE_EVENT pair should be
// acceptable.

// Network streams are requested with COM_BINLOG_DUMP and
// each Binlog Event response is prepended with 00 OK-byte.
static constexpr std::byte expected_event_packet_prefix{'\0'};

util::const_byte_span portion;
binsrv::event::optional_format_description_post_header fde_post_header{};
binsrv::event::optional_format_description_body fde_body{};

while (!(portion = binlog.fetch()).empty()) {
// Network streams are requested with COM_BINLOG_DUMP and
// prepend each Binlog Event with 00 OK-byte.
static constexpr unsigned char expected_event_prefix = '\0';
if (portion[0] != expected_event_prefix) {
if (portion[0] != expected_event_packet_prefix) {
util::exception_location().raise<std::invalid_argument>(
"unexpected event prefix");
}
@@ -173,8 +233,25 @@ int main(int argc, char *argv[]) {
"fetched " + std::to_string(std::size(portion)) +
"-byte(s) event from binlog");

const binsrv::event_header generic_event{portion};
log_generic_event(*logger, generic_event);
const binsrv::event::event generic_event{portion, fde_post_header,
fde_body};

log_event_common_header(*logger, generic_event.get_common_header());
if (generic_event.get_common_header().get_type_code() ==
binsrv::event::code_type::format_description) {
const auto &local_fde_post_header =
std::get<binsrv::event::generic_post_header<
binsrv::event::code_type::format_description>>(
generic_event.get_post_header());
const auto &local_fde_body = std::get<binsrv::event::generic_body<
binsrv::event::code_type::format_description>>(
generic_event.get_body());

log_format_description_event(*logger, local_fde_post_header,
local_fde_body);
fde_post_header = local_fde_post_header;
fde_body = local_fde_body;
}
log_span_dump(*logger, portion);
}

4 changes: 2 additions & 2 deletions src/binsrv/basic_logger.cpp
@@ -16,8 +16,8 @@ basic_logger::basic_logger(log_severity min_level) noexcept

void basic_logger::log(log_severity level, std::string_view message) {
if (level >= min_level_) {
static constexpr auto timestamp_length =
std::size("YYYY-MM-DDTHH:MM:SS.fffffffff") - 1;
static constexpr auto timestamp_length{
std::size("YYYY-MM-DDTHH:MM:SS.fffffffff") - 1U};
const auto timestamp = boost::posix_time::microsec_clock::universal_time();
;
const auto level_label = to_string_view(level);
48 changes: 48 additions & 0 deletions src/binsrv/event/checksum_algorithm_type.hpp
@@ -0,0 +1,48 @@
#ifndef BINSRV_EVENT_CHECKSUM_ALGORITHM_TYPE_HPP
#define BINSRV_EVENT_CHECKSUM_ALGORITHM_TYPE_HPP

#include "binsrv/event/checksum_algorithm_type_fwd.hpp" // IWYU pragma: export

#include <algorithm>
#include <array>
#include <cstdint>
#include <string_view>

namespace binsrv::event {

// NOLINTBEGIN(cppcoreguidelines-macro-usage)
// Checksum algorithm type codes copied from
// https://github.com/mysql/mysql-server/blob/mysql-8.0.35/libbinlogevents/include/binlog_event.h#L425
// clang-format off
#define BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_SEQUENCE() \
BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_MACRO(off , 0), \
BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_MACRO(crc32, 1)
// clang-format on

#define BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_MACRO(X, Y) X = Y
enum class checksum_algorithm_type : std::uint8_t {
BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_SEQUENCE(),
delimiter
};
#undef BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_MACRO

inline std::string_view to_string_view(checksum_algorithm_type code) noexcept {
using namespace std::string_view_literals;
using nv_pair = std::pair<checksum_algorithm_type, std::string_view>;
#define BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_MACRO(X, Y) \
nv_pair { checksum_algorithm_type::X, #X##sv }
static constexpr std::array labels{
BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_SEQUENCE(),
nv_pair{checksum_algorithm_type::delimiter, ""sv}};
#undef BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_MACRO
return std::ranges::find(labels,
std::min(checksum_algorithm_type::delimiter, code),
&nv_pair::first)
->second;
}
#undef BINSRV_CHECKSUM_ALGORITHM_TYPE_XY_SEQUENCE
// NOLINTEND(cppcoreguidelines-macro-usage)

} // namespace binsrv::event

#endif // BINSRV_EVENT_CHECKSUM_ALGORITHM_TYPE_HPP
12 changes: 12 additions & 0 deletions src/binsrv/event/checksum_algorithm_type_fwd.hpp
@@ -0,0 +1,12 @@
#ifndef BINSRV_EVENT_CHECKSUM_ALGORITHM_TYPE_FWD_HPP
#define BINSRV_EVENT_CHECKSUM_ALGORITHM_TYPE_FWD_HPP

#include <cstdint>

namespace binsrv::event {

enum class checksum_algorithm_type : std::uint8_t;

} // namespace binsrv::event

#endif // BINSRV_EVENT_CHECKSUM_ALGORITHM_TYPE_FWD_HPP