Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.17] [ML] Data Frame Analytics: Fix field name escaping for Vega based scatterplot matrix. (#193386) #193841

Merged
merged 1 commit into from
Sep 24, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
[ML] Data Frame Analytics: Fix field name escaping for Vega based sca…
…tterplot matrix. (#193386)

Field names with `\n` would fail to render the DFA scatterplot matrix:

<img width="804" alt="image"
src="https://github.com/user-attachments/assets/26e356b8-236d-4255-b556-2ebc2e5db4fc">

This fixes the escaping and adds unit tests.

The fix isn't 100% ideal because there are cases when we may end up with
an additional backslash being rendered for labels of the scatterplot.
However, all other variations I tried caused rendering problems of the
charts and rendering would fail completely.

For example, just escaping `\n` without the general backslash escaping
causes the following Vega error: `Duplicate scale or projection name:
"child__row_my_numbercolumn_my_number_x"`

On the other hand escaping just the backslash without the additional
`\n` escaping causes an "expression parse error" in in Vega and the
chart wouldn't render.

Note this PR just focuses on escaping for the Vega spec for the
scatterplot matrix. There are still other places in the UI (data grid
headers, fields selector).

<img width="792" alt="image"
src="https://github.com/user-attachments/assets/35532741-7a13-4707-b8da-c72dcc8c935b">

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

(cherry picked from commit 194d630)
  • Loading branch information
walterra committed Sep 24, 2024
commit de15eed7c8e778dc5519923725c4db864f656051
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ import { LEGEND_TYPES } from '../vega_chart/common';

import {
getColorSpec,
getEscapedVegaFieldName,
getScatterplotMatrixVegaLiteSpec,
COLOR_OUTLIER,
COLOR_RANGE_NOMINAL,
Expand Down Expand Up @@ -56,6 +57,49 @@ describe('getColorSpec()', () => {
});
});

describe('getEscapedVegaFieldName()', () => {
it('should escape dots in field names', () => {
const fieldName = 'field.name';
const escapedFieldName = getEscapedVegaFieldName(fieldName);
expect(escapedFieldName).toBe('field\\.name');
});

it('should escape brackets in field names', () => {
const fieldName = 'field[name]';
const escapedFieldName = getEscapedVegaFieldName(fieldName);
expect(escapedFieldName).toBe('field\\[name\\]');
});

it('should escape both dots and brackets in field names', () => {
const fieldName = 'field.name[0]';
const escapedFieldName = getEscapedVegaFieldName(fieldName);
expect(escapedFieldName).toBe('field\\.name\\[0\\]');
});

it('should return the same string if there are no special characters', () => {
const fieldName = 'fieldname';
const escapedFieldName = getEscapedVegaFieldName(fieldName);
expect(escapedFieldName).toBe('fieldname');
});

it('should escape newlines in field names', () => {
// String quotes process backslashes, so we need to escape them for
// the test string to contain a backslash. For example, without the
// double backslash, this string would contain a newline character.
const fieldName = 'field\\name';
const escapedFieldName = getEscapedVegaFieldName(fieldName);
expect(escapedFieldName).toBe('field\\\\name');
});

it('should escape backslashes in field names', () => {
// String quotes process backslashes, so we need to escape them for
// the test string to contain a backslash.
const fieldName = 'fieldname\\withbackslash';
const escapedFieldName = getEscapedVegaFieldName(fieldName);
expect(escapedFieldName).toBe('fieldname\\\\withbackslash');
});
});

describe('getScatterplotMatrixVegaLiteSpec()', () => {
it('should return the default spec for non-outliers without a legend', () => {
const data = [{ x: 1, y: 1 }];
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -59,11 +59,28 @@ export const getColorSpec = (
return { value: DEFAULT_COLOR };
};

// Escapes the characters .[] in field names with double backslashes
// Escapes the characters .[]\ in field names with double backslashes
// since VEGA treats dots/brackets in field names as nested values.
// See https://vega.github.io/vega-lite/docs/field.html for details.
function getEscapedVegaFieldName(fieldName: string) {
return fieldName.replace(/([\.|\[|\]])/g, '\\$1');
export function getEscapedVegaFieldName(fieldName: string) {
// Note the following isn't 100% ideal because there are cases when we may
// end up with an additional backslash being rendered for labels of the
// scatterplot. However, all other variations I tried caused rendering
// problems of the charts and rendering would fail completely.

// For example, just escaping \n in the first replace without the general
// backslash escaping causes the following Vega error:
// Duplicate scale or projection name: "child__row_my_numbercolumn_my_number_x"

// Escaping just the backslash without the additional \n escaping causes
// causes an "expression parse error" in Vega and the chart wouldn't render.

// Escape newline characters
fieldName = fieldName.replace(/\n/g, '\\n');
// Escape .[]\
fieldName = fieldName.replace(/([\.|\[|\]|\\])/g, '\\$1');

return fieldName;
}

type VegaValue = Record<string, string | number>;
Expand Down