gstvalue: make gst_string_unwrap less strict

Allow a string in gst_string_unwrap to include unescaped characters that
are not in GST_STRING_IS_ASCII. This extra leniency allows
gst_structure_from_string to, e.g., receive
    name, val=(string)"string with space";

Note that many gst tests, and potentially users, exploited this behaviour
by giving
    name, val="string with space";
i.e. without the (string) type specifier. This was allowed before
because, without a type specifier, the string was passed to
_priv_gst_value_parse_string with unescape set to TRUE, *rather* than
being sent to gst_string_unwrap. This caused a difference in behaviour
between strings that are or are not preceded by (string). E.g.
    name, val=(string)"string with space";
would fail, whilst
    name, val="string with space";
would not. And
    name, val=(string)"\316\261";
would produce a val="α", whereas
    name, val=(string)"\316\261";
would produce a val="316261" (a bug).

The current behaviour is to treat both of these cases the same, which is
desirable. But in order to not break potentially common usage of this
discrepancy (it was in our own tests), the best option is to make string
parsing less strict in general.

New behaviour would be for
    name, val=(string)"string with space";
to pass and give val="string with space", and
    name, val="\316\261";
would produce a val="α".

Also changed deserializing string test to expect successes where
previously a failure was expected.

In a similar way, this also effected the deserializing of GstStructure,
GstCaps, GstTagList and GstCapsFeatures. So, now
    name, val=(structure)"sub-name, sub-val=(string)\"a: \\316\\261\";";
will also pass and give sub-val="a: α". Note that the quote marks
and backslash still need to be escaped for the sub-structure, but other
characters need not be.

Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/303>
This commit is contained in:
Henry Wilkes 2019-10-23 12:48:32 +01:00 committed by GStreamer Merge Bot
parent 454b121ff1
commit 445df0c799
2 changed files with 18 additions and 14 deletions

View file

@ -3931,13 +3931,13 @@ gst_string_unwrap (const gchar * s)
/* if we run into a \0 here, we definitely won't get a quote later */
if (*read == 0)
goto beach;
/* else copy \X sequence */
*write++ = *read++;
}
} else {
/* weird character, error */
} else if (*read == '\0') {
goto beach;
} else {
*write++ = *read++;
}
}
/* if the string is not ending in " and zero terminated, we error */
@ -3975,6 +3975,10 @@ gst_value_deserialize_string (GValue * dest, const gchar * s)
gchar *str = gst_string_unwrap (s);
if (G_UNLIKELY (!str))
return FALSE;
if (!g_utf8_validate (str, -1, NULL)) {
g_free (str);
return FALSE;
}
g_value_take_string (dest, str);
}

View file

@ -801,21 +801,21 @@ GST_START_TEST (test_deserialize_string)
"\"Hello\\ World", "\"Hello\\ World"}, {
"\"\\", "\"\\"}, {
"\"\\0", "\"\\0"}, {
"\"t\\303\\274t\"", "tüt"}, {
/* utf8 octal sequence */
"", ""}, /* empty strings */
{
"\"\"", ""}, /* quoted empty string -> empty string */
"\"\"", ""}, { /* quoted empty string -> empty string */
"\" \"", " "}, { /* allow spaces to be not escaped */
"tüüt", "tüüt"}, /* allow special chars to be not escaped */
/* Expected FAILURES: */
{
"\"\\0\"", NULL}, /* unfinished escaped character */
{
"\"", NULL}, /* solitary quote */
{
"\" \"", NULL}, /* spaces must be escaped */
#if 0
/* FIXME 0.9: this test should fail, but it doesn't */
{
"tüüt", NULL} /* string with special chars must be escaped */
#endif
"\"\\0\"", NULL}, { /* unfinished escaped character */
"\"", NULL}, { /* solitary quote */
"\"\\380\"", NULL}, { /* invalid octal sequence */
"\"\\344\\204\\062\"", NULL}, {
/* invalid utf8: wrong end byte */
"\"\\344\\204\"", NULL} /* invalid utf8: wrong number of bytes */
};
guint i;
GValue v = { 0, };