Difference in parse and write validation


I encountered some differences between kParseValidateEncodingFlag and kWriteValidateEncodingFlag with regard to surrogates.

In the following example:

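#include <iostream>
#include <stdexcept>
#include "rapidjson/document.h"
#include "rapidjson/ostreamwrapper.h"
#include "rapidjson/writer.h"

int main() {
   const auto input = R"EOS(["\udc4d"])EOS";
   rapidjson::Document doc;
   // Parsing succeeds even though the string holds a lone low surrogate.
   rapidjson::ParseResult parseResult = doc.Parse<rapidjson::kParseValidateEncodingFlag>(input);
   if (parseResult.IsError()) {
     throw std::runtime_error("Parse Error");
   }
   rapidjson::OStreamWrapper wrapper{std::cout};
   rapidjson::Writer<rapidjson::OStreamWrapper, rapidjson::UTF8<>, rapidjson::UTF8<>, rapidjson::CrtAllocator, rapidjson::kWriteValidateEncodingFlag> writer(wrapper);
   // Writing the same document fails encoding validation.
   if (!doc.Accept(writer)) {
     throw std::runtime_error("Write Error");
   }
}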

The input (an array containing a single low surrogate) is parsed without an error, but writing the document to the stream later fails and the "Write Error" exception is thrown. Is this a known issue, or am I missing something in the example? If this is an issue that should be fixed, I could provide a pull request that modifies the handling of surrogates in reader.h.

Thanks for your work and kind regards


Answer from miloyip

I think kParseValidateEncodingFlag is not related to this problem and can be omitted when reproducing it. I checked, and the Reader currently handles an unpaired surrogate when it is a single high surrogate, but not when it is a single low surrogate.

I think it should generate kParseErrorStringUnicodeSurrogateInvalid for a leading low surrogate as well. I have not researched this much, though. We may need to dig further into the standards and try other implementations. Open for discussion.
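To make the proposed rule concrete, here is a minimal standalone sketch of a symmetric surrogate check. It is not RapidJSON code: `SurrogateCheck`, `IsHighSurrogate`, `IsLowSurrogate` and `CheckSurrogates` are hypothetical names used only for illustration, and the input is assumed to be the UTF-16 code units already decoded from `\uXXXX` escapes. A real fix in reader.h would presumably report kParseErrorStringUnicodeSurrogateInvalid for both failure cases.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch; not part of RapidJSON.
enum class SurrogateCheck { Ok, LoneHigh, LoneLow };

// UTF-16 high (leading) surrogates occupy U+D800..U+DBFF.
inline bool IsHighSurrogate(std::uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// UTF-16 low (trailing) surrogates occupy U+DC00..U+DFFF.
inline bool IsLowSurrogate(std::uint16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Checks a sequence of code units (as decoded from \uXXXX escapes)
// and reports the first unpaired surrogate, if any.
inline SurrogateCheck CheckSurrogates(const std::vector<std::uint16_t>& units) {
    for (std::size_t i = 0; i < units.size(); ++i) {
        if (IsHighSurrogate(units[i])) {
            // A high surrogate must be followed immediately by a low surrogate.
            if (i + 1 >= units.size() || !IsLowSurrogate(units[i + 1])) {
                return SurrogateCheck::LoneHigh;
            }
            ++i;  // Skip the low half of the valid pair.
        } else if (IsLowSurrogate(units[i])) {
            // A low surrogate with no preceding high surrogate,
            // which is the case raised in this issue ("\udc4d").
            return SurrogateCheck::LoneLow;
        }
    }
    return SurrogateCheck::Ok;
}
```

With this rule, the input from the question (`{0xDC4D}`) is rejected as a lone low surrogate, while a proper pair such as `{0xD83D, 0xDE00}` is accepted.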

