Serialization Reference
The following data structures exist in the namespaces cista::offset and cista::raw:
- vector<T>: serializable version of std::vector<T>
- string: serializable version of std::string
- unique_ptr<T>: serializable version of std::unique_ptr<T>
- hash_map<K, V>: serializable version of std::unordered_map (using Google's Swiss Table)
- ptr<T>: serializable pointer: cista::raw::ptr<T> is just a T*, cista::offset::ptr<T> is a specialized data structure that behaves mostly like a T* (overloaded ->, *, etc. operators).
Currently, they do not provide exactly the same interface as their std:: equivalents.
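To make the difference between the two ptr<T> flavours concrete, here is a minimal sketch of a self-relative pointer. It is not cista's implementation: the names offset_ptr_sketch, record_sketch and survives_relocation are hypothetical, and the choice of a reserved offset value for null is an assumption. The idea shown is only that the stored value is a distance from the pointer object itself, so the bytes stay valid when the whole buffer is copied to another address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <new>

// Hypothetical self-relative pointer; NOT cista's actual implementation.
template <typename T>
struct offset_ptr_sketch {
  // Reserved offset value that encodes a null pointer (an assumption).
  static constexpr std::intptr_t null_offset() {
    return std::numeric_limits<std::intptr_t>::min();
  }

  // Stores the distance from this object to the target.
  void set(T* target) {
    if (target == nullptr) {
      offset_ = null_offset();
    } else {
      offset_ = reinterpret_cast<std::intptr_t>(target) -
                reinterpret_cast<std::intptr_t>(this);
    }
  }

  // Rebuilds the address from this object's current location.
  T* get() const {
    if (offset_ == null_offset()) {
      return nullptr;
    }
    return reinterpret_cast<T*>(reinterpret_cast<std::intptr_t>(this) +
                                offset_);
  }

  T* operator->() const { return get(); }
  T& operator*() const { return *get(); }

  // Note: copying this object on its own to a different address would
  // invalidate the offset; a real implementation has to handle that.
  std::intptr_t offset_{null_offset()};
};

struct record_sketch {
  std::int32_t value_{0};
  offset_ptr_sketch<std::int32_t> to_value_;
};

// Copies the bytes of a record to a new address and checks that the
// pointer follows the data instead of referring back to the old copy.
inline bool survives_relocation() {
  alignas(record_sketch) unsigned char a[sizeof(record_sketch)];
  alignas(record_sketch) unsigned char b[sizeof(record_sketch)];

  auto* original = new (a) record_sketch();
  original->value_ = 42;
  original->to_value_.set(&original->value_);

  std::memcpy(b, a, sizeof(record_sketch));  // stands in for "save + load"
  auto* copy = reinterpret_cast<record_sketch*>(b);
  original->value_ = 0;  // the copy must not observe this change
  return copy->to_value_.get() == &copy->value_ && *copy->to_value_ == 42;
}
```

A plain T* copied the same way would still point into the first buffer, which is why raw pointers need a fix-up step after loading.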
A cista::ptr<T> can only point to null or to a value stored in the serialized buffer. Pointing to a value within the serialized buffer requires that the offset at which the value is written is known at serialization time.
There are three ways to index an address in order to serialize a pointer to it:
- cista::unique_ptr<T>: Every cista::unique_ptr<T> will be indexed. Thus, pointing to values held by a cista::unique_ptr<T> is possible.
- cista::indexed_vector<T>: Within a cista::indexed_vector<T>, every value can be referenced. This is more efficient than a cista::vector<cista::unique_ptr<T>>. However, cista::vector<T> and cista::indexed_vector<T> do not provide pointer stability after non-const operations such as resize or emplace_back.
- cista::indexed<T>: To be able to point to the value of member variables, it is possible to use cista::indexed<T>. cista::indexed<T> inherits from T and thus can be used just like a T.
An example using cista::indexed_vector<T> and cista::indexed<T>:
    namespace data = cista::offset;

    struct node;

    struct edge {
      data::ptr<node> from_;
      data::ptr<node> to_;
    };

    struct node {
      uint32_t id_{0};
      data::vector<data::ptr<edge>> edges_;
      cista::indexed<data::string> name_;  // indexed member: can be pointed to
    };

    struct graph {
      data::indexed_vector<node> nodes_;  // every node can be referenced
      data::indexed_vector<edge> edges_;  // every edge can be referenced
      data::vector<data::ptr<data::string>> node_names_;  // points to node::name_
    };
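The requirement that a target's offset is known at serialization time can be pictured with a small bookkeeping sketch. This is an illustration under assumptions, not cista's internals: index_sketch, write_indexed, write_pointer_to, resolve and read_slot are hypothetical names, alignment is ignored, and pointers are stored as absolute offsets from the start of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for "indexed" addresses; not cista's real code.
struct index_sketch {
  // Appends the bytes of a value and remembers where its address landed.
  template <typename T>
  std::int64_t write_indexed(T const& value) {
    auto const pos = static_cast<std::int64_t>(buf_.size());
    auto const* bytes = reinterpret_cast<std::uint8_t const*>(&value);
    buf_.insert(buf_.end(), bytes, bytes + sizeof(T));
    offsets_[&value] = pos;
    return pos;
  }

  // Reserves an 8 byte slot for a pointer whose target may not have been
  // written yet. Returns the position of the slot.
  std::size_t write_pointer_to(void const* target) {
    auto const slot = buf_.size();
    buf_.resize(buf_.size() + sizeof(std::int64_t));
    pending_.emplace_back(target, slot);
    return slot;
  }

  // Patches every reserved slot with the offset of its target. Returns
  // false if a target address was never indexed.
  bool resolve() {
    for (auto const& p : pending_) {
      auto const it = offsets_.find(p.first);
      if (it == offsets_.end()) {
        return false;
      }
      std::memcpy(&buf_[p.second], &it->second, sizeof(it->second));
    }
    pending_.clear();
    return true;
  }

  std::vector<std::uint8_t> buf_;
  std::map<void const*, std::int64_t> offsets_;  // address -> buffer offset
  std::vector<std::pair<void const*, std::size_t>> pending_;  // target, slot
};

// Reads back the offset stored in a pointer slot.
inline std::int64_t read_slot(index_sketch const& s, std::size_t slot) {
  std::int64_t value = 0;
  std::memcpy(&value, &s.buf_[slot], sizeof(value));
  return value;
}
```

In this picture, a pointer to an address that was never registered cannot be resolved, which mirrors why only values reachable through one of the three indexing mechanisms can be pointed to.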
Serialization and deserialization have to use the same mode. This can be ensured by storing the mode in a constexpr variable, which can then be passed to both cista::serialize() and cista::deserialize().
The cista::mode enum provides the following values:
- NONE - default mode (default values are listed below)
- UNCHECKED - do no bounds checks for types (only affects deserialization)
- WITH_VERSION - store the data structure version (8 bytes), default value: off
- WITH_INTEGRITY - store a hash sum of the serialized data (8 bytes), default value: off
- SERIALIZE_BIG_ENDIAN - use big endian format when serializing (default: little endian)
- DEEP_CHECK - apply deep checking for security (only affects deserialization)
- CAST - casts the buffer pointer (with compile time checks that the buffer stays unmodified: no endian conversion and only offset pointer data structures)
The stored data structure version (cista::mode::WITH_VERSION) and hash sum (cista::mode::WITH_INTEGRITY) are checked at deserialization (if available).
Note that you cannot store the integrity checksum and/or data structure version and omit the flag at deserialization because they affect where the actual data starts.
These values work as a bit mask.
Example:
    constexpr auto const MODE = cista::mode::WITH_VERSION |
                                cista::mode::WITH_INTEGRITY |
                                cista::mode::DEEP_CHECK;
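The following self-contained sketch shows how such a bit mask can be combined and queried, and why omitting a flag at deserialization shifts the start of the data. It uses a hypothetical stand-in enum mode_sketch with made-up numeric values, and payload_start assumes that the version and the hash sum are each stored as an 8 byte field in front of the payload; neither is taken from cista's headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for cista::mode; the numeric values are made up.
enum class mode_sketch : std::uint32_t {
  NONE = 0U,
  UNCHECKED = 1U << 0U,
  WITH_VERSION = 1U << 1U,
  WITH_INTEGRITY = 1U << 2U,
  SERIALIZE_BIG_ENDIAN = 1U << 3U,
  DEEP_CHECK = 1U << 4U,
  CAST = 1U << 5U
};

// Combines two flags into one mask.
constexpr mode_sketch operator|(mode_sketch a, mode_sketch b) {
  return static_cast<mode_sketch>(static_cast<std::uint32_t>(a) |
                                  static_cast<std::uint32_t>(b));
}

// Returns true if `flag` is set in `mask`.
constexpr bool is_enabled(mode_sketch mask, mode_sketch flag) {
  return (static_cast<std::uint32_t>(mask) &
          static_cast<std::uint32_t>(flag)) != 0U;
}

// Assumption: each enabled header field occupies 8 bytes before the payload.
constexpr std::size_t payload_start(mode_sketch m) {
  return (is_enabled(m, mode_sketch::WITH_VERSION) ? 8U : 0U) +
         (is_enabled(m, mode_sketch::WITH_INTEGRITY) ? 8U : 0U);
}

constexpr auto SKETCH_MODE = mode_sketch::WITH_VERSION |
                             mode_sketch::WITH_INTEGRITY |
                             mode_sketch::DEEP_CHECK;

static_assert(payload_start(SKETCH_MODE) == 16U,
              "version and hash sum precede the payload");
static_assert(payload_start(mode_sketch::DEEP_CHECK) == 0U,
              "check-only flags do not change the layout");
```

Under these assumptions, a reader that drops WITH_INTEGRITY would look for the payload 8 bytes too early, which is the mismatch the note above warns about.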
The following methods can be used to serialize either to a std::vector<uint8_t> (default) or to an arbitrary serialization target.
- std::vector<uint8_t> cista::serialize<mode Mode = mode::NONE, T>(T const&) serializes an object of type T and returns a buffer containing the serialized object.
- void cista::serialize<mode const Mode = mode::NONE, Target, T>(Target&, T const&) serializes an object of type T to the specified target. Targets are either cista::buf<Buf> (where Buf can either be a simple std::vector<uint8_t> or a cista::mmap) or cista::file. Custom target structs should provide write functions as described here.
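As a rough idea of what a custom target might look like, the sketch below implements an in-memory target with two write overloads: one that appends bytes at an aligned position and one that overwrites an already written position (useful for patching pointer slots later). The struct name vector_target_sketch and both signatures are assumptions made for illustration; the authoritative requirements are the ones in the description referenced above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical in-memory target; the signatures are assumptions.
struct vector_target_sketch {
  // Appends `size` bytes at the next position that is a multiple of
  // `alignment` and returns that position. Padding bytes are zero.
  std::size_t write(void const* ptr, std::size_t size,
                    std::size_t alignment) {
    auto const align = alignment == 0U ? 1U : alignment;
    auto const pos = (buf_.size() + align - 1U) / align * align;
    buf_.resize(pos + size);  // new bytes are value-initialized to zero
    if (size != 0U) {
      std::memcpy(buf_.data() + pos, ptr, size);
    }
    return pos;
  }

  // Overwrites previously written bytes at `pos` with `val`.
  template <typename T>
  void write(std::size_t pos, T const& val) {
    if (pos > buf_.size() || sizeof(T) > buf_.size() - pos) {
      throw std::out_of_range("write position outside of buffer");
    }
    std::memcpy(buf_.data() + pos, &val, sizeof(T));
  }

  std::vector<std::uint8_t> buf_;
};
```

The append overload returns the position so the caller can record where a value ended up; the positional overload never grows the buffer.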
The following functions exist in cista::offset and cista::raw:
- T* deserialize<T, cista::mode Mode = cista::mode::NONE, Container>(Container&) deserializes an object from a std::vector<uint8_t> or similar data structure. This function throws a std::runtime_error if the data is not well-formed.
- T* deserialize<T, cista::mode Mode = cista::mode::NONE>(CharT* from, CharT* to = nullptr) deserializes an object from a pointer range. This function throws a std::runtime_error if the data is not well-formed.
- reinterpret_cast<T>(ptr): Same as deserialize<T, cista::mode::CAST>. If you are using offset mode and the machine endian format is the same as the serialized one, you may as well just call reinterpret_cast<T>(ptr).
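To illustrate the difference between a checked deserialization and a bare reinterpret_cast, here is a simplified sketch of the kind of validation a pointer-range overload can perform before handing out a typed pointer. The function checked_cast_sketch is hypothetical and covers only size and alignment of the top-level object; it does not model the to = nullptr default or any of the deeper checks a real implementation may apply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: validates a byte range before casting it to T*.
template <typename T>
T* checked_cast_sketch(std::uint8_t* from, std::uint8_t* to) {
  if (from == nullptr || to == nullptr || to < from) {
    throw std::runtime_error("invalid range");
  }
  if (static_cast<std::size_t>(to - from) < sizeof(T)) {
    throw std::runtime_error("buffer too small for T");
  }
  if (reinterpret_cast<std::uintptr_t>(from) % alignof(T) != 0U) {
    throw std::runtime_error("buffer not aligned for T");
  }
  return reinterpret_cast<T*>(from);
}
```

A plain reinterpret_cast skips all of these checks, so it is only appropriate when the buffer is trusted.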
Const variants:
- T* deserialize<T, cista::mode Mode = cista::mode::NONE, Container>(Container const&) same as non-const variant above.
- T* deserialize<T, cista::mode Mode = cista::mode::NONE>(CharT const* from, CharT const* to = nullptr) same as non-const variant above.
Note that there are restrictions if the input is const: deserialization of raw pointers (e.g. in most cista::raw data structures except cista::array) as well as endian conversion are not supported, as they require modification of the buffer. If you use offset mode and the deserialization does not require endian conversion, const inputs can be deserialized because they do not need to be modified.
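The two operations that rule out const inputs can be sketched as follows. Both helpers are hypothetical and simplified: fix_up_raw_slot assumes that a raw pointer is stored as an offset from the start of the buffer and is replaced by an absolute address when loading, and swap_endian_in_place reverses the bytes of a single field. What matters is that both write into the buffer, so neither can accept a pointer to const data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical raw pointer fix-up: replaces the stored offset (relative
// to `base`) in the slot at position `slot` with an absolute address.
inline void fix_up_raw_slot(std::uint8_t* base, std::size_t slot) {
  std::uintptr_t value = 0;
  std::memcpy(&value, base + slot, sizeof(value));
  value += reinterpret_cast<std::uintptr_t>(base);
  std::memcpy(base + slot, &value, sizeof(value));  // mutates the buffer
}

// Hypothetical endian conversion of one field of `size` bytes.
inline void swap_endian_in_place(std::uint8_t* field, std::size_t size) {
  std::reverse(field, field + size);  // mutates the buffer
}
```

An offset pointer, by contrast, is resolved relative to its own address each time it is read, so a buffer in offset mode with matching endianness can be used without any write.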