MergeTree: Add stress for LocalReferences (#9340)
This change leverages the existing stress infrastructure in merge tree. We call these tests farms, as in bug farms, because they find lots of bugs. This is in preparation for how we do local reference sliding.

related to #1008
anthony-murphy committed Mar 8, 2022
1 parent 1f8c2bb commit 8e864f7
Showing 3 changed files with 154 additions and 17 deletions.
9 changes: 9 additions & 0 deletions packages/dds/merge-tree/DEV.md
@@ -8,3 +8,12 @@ This distinction is important, as a removed segment with undefined length may not
However, a not yet visible segment with 0 length may already exist, or will eventually exist, on all clients.
These have implications for eventually consistent conflict resolution. Generally, we ignore removed segments and special-case invisible segments, as in the case
of conflicting inserts, as handled in the `breakTie` function.

### Zamboni
Zamboni is the garbage collection process in the merge tree. As segments change due to inserts and deletes, we add them to a heap which keeps the segment with the lowest sequence number at the head. These segments drive the zamboni process, which also runs on every change. The zamboni process peeks at the heap to determine whether the head is below the minimum sequence number; if it is, the segment is eligible. The minimum sequence number is important here, as it is a sequence number seen by all clients, and all clients will specify their reference sequence number as above the minimum sequence number. This means that no new operations can come in that reference anything at or below the minimum sequence number, so we are safe to clean up anything we would otherwise need in order to apply incoming operations. Eligible segments are collected, and then a few different operations are done: specifically merge, remove, and tree rebalance. Zamboni is incremental, and only collects a constant number of segments at each change so as not to introduce performance issues.
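
The scheduling described above can be sketched as a min-heap keyed by sequence number plus a bounded collection step. This is a minimal illustration, not the merge tree's actual code: `SketchSegment`, `SeqMinHeap`, `collectEligible`, and `maxSegmentsPerPass` are hypothetical names, and treating "at or below the minimum sequence number" as the eligibility test is an assumption.

```typescript
// Hypothetical, simplified segment: only the fields needed for scheduling.
interface SketchSegment {
    text: string;
    seq: number; // sequence number of the latest change to this segment
}

// Minimal binary min-heap ordered by seq (lowest seq at the head).
class SeqMinHeap {
    private readonly items: SketchSegment[] = [];

    get size(): number { return this.items.length; }

    peek(): SketchSegment | undefined { return this.items[0]; }

    push(seg: SketchSegment): void {
        this.items.push(seg);
        let i = this.items.length - 1;
        // Sift the new item up until its parent is not larger.
        while (i > 0) {
            const parent = (i - 1) >> 1;
            if (this.items[parent].seq <= this.items[i].seq) { break; }
            [this.items[parent], this.items[i]] = [this.items[i], this.items[parent]];
            i = parent;
        }
    }

    pop(): SketchSegment | undefined {
        const top = this.items[0];
        const last = this.items.pop();
        if (last !== undefined && this.items.length > 0) {
            // Move the last item to the head and sift it down.
            this.items[0] = last;
            let i = 0;
            for (;;) {
                const l = 2 * i + 1;
                const r = l + 1;
                let smallest = i;
                if (l < this.items.length && this.items[l].seq < this.items[smallest].seq) { smallest = l; }
                if (r < this.items.length && this.items[r].seq < this.items[smallest].seq) { smallest = r; }
                if (smallest === i) { break; }
                [this.items[smallest], this.items[i]] = [this.items[i], this.items[smallest]];
                i = smallest;
            }
        }
        return top;
    }
}

// Assumed constant bound that keeps each pass incremental.
const maxSegmentsPerPass = 2;

// Collect at most maxSegmentsPerPass segments whose seq is at or below minSeq.
function collectEligible(heap: SeqMinHeap, minSeq: number): SketchSegment[] {
    const eligible: SketchSegment[] = [];
    while (eligible.length < maxSegmentsPerPass) {
        const head = heap.peek();
        if (head === undefined || head.seq > minSeq) { break; }
        heap.pop();
        eligible.push(head);
    }
    return eligible;
}
```

Because each call collects only a bounded number of segments, several calls may be needed to drain everything that has become eligible, which matches the incremental behavior described above.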

Merge is done if two adjacent segments are of the same type (like text), that type is mergeable (markers are not), neither is deleted, and all the properties match. The merge process reduces the number of segments, which are leaf nodes of the merge tree. For instance, a user may type `c`, `a`, and `t`, with each character being its own operation and therefore its own segment. The user could then highlight that range and set a property on all the characters indicating that they are bold, `{bold: true}`. At some later point, these segments would move to the top of the heap, and their sequence numbers would move below the minimum sequence number. At that point zamboni could take those individual segments and merge them into a single segment, `cat`, with the property `{bold: true}`.
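
The `cat` example can be sketched as a single pass over adjacent segments. This is a simplified illustration under stated assumptions: `SketchTextSegment`, `propsMatch`, `canMerge`, and `mergeAdjacent` are hypothetical names, properties are compared shallowly, and a segment counts as old enough when its sequence number is at or below the minimum sequence number.

```typescript
type SketchProps = Record<string, unknown>;

// Hypothetical, simplified segment shape for illustration only.
interface SketchTextSegment {
    kind: "text" | "marker";
    text: string;
    seq: number;          // sequence number of the latest change
    removedSeq?: number;  // set once the segment has been removed
    props?: SketchProps;
}

// Shallow equality of property bags; missing props are treated as empty.
function propsMatch(a: SketchProps = {}, b: SketchProps = {}): boolean {
    const aKeys = Object.keys(a);
    const bKeys = Object.keys(b);
    return aKeys.length === bKeys.length
        && aKeys.every((k) => Object.prototype.hasOwnProperty.call(b, k) && a[k] === b[k]);
}

// Both segments must be text, not removed, old enough, and have equal props.
function canMerge(a: SketchTextSegment, b: SketchTextSegment, minSeq: number): boolean {
    return a.kind === "text" && b.kind === "text"
        && a.removedSeq === undefined && b.removedSeq === undefined
        && a.seq <= minSeq && b.seq <= minSeq
        && propsMatch(a.props, b.props);
}

// Fold each segment into its left neighbor when the pair is mergeable.
function mergeAdjacent(segments: readonly SketchTextSegment[], minSeq: number): SketchTextSegment[] {
    const out: SketchTextSegment[] = [];
    for (const seg of segments) {
        const prev = out[out.length - 1];
        if (prev !== undefined && canMerge(prev, seg, minSeq)) {
            out[out.length - 1] = {
                ...prev,
                text: prev.text + seg.text,
                seq: Math.max(prev.seq, seg.seq),
            };
        } else {
            out.push({ ...seg });
        }
    }
    return out;
}
```

With `c`, `a`, and `t` at sequence numbers 1, 2, and 3 and all bold, a minimum sequence number of 3 yields one `cat` segment, while a minimum sequence number of 2 leaves `t` unmerged.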

Remove is a bit simpler. On removal of a segment, we track its removed sequence number. When the segment's removed sequence number drops below the minimum sequence number, it can be safely removed from the tree.
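
A minimal sketch of that check, assuming a hypothetical `SketchRemovable` shape and treating a removed sequence number at or below the minimum sequence number as safe to drop (the exact comparison is an assumption):

```typescript
// Hypothetical, simplified segment: removedSeq is set only after a remove op.
interface SketchRemovable {
    text: string;
    removedSeq?: number;
}

// Keep segments that were never removed, or whose removal some client
// might still not have seen (removedSeq above minSeq).
function dropRemoved(segments: readonly SketchRemovable[], minSeq: number): SketchRemovable[] {
    return segments.filter((s) => s.removedSeq === undefined || s.removedSeq > minSeq);
}
```

A segment removed at sequence number 7 therefore stays in the list while the minimum sequence number is 6 and disappears once it reaches 7.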

Rebalance is a bit different from merge and remove, as it has to do with maintaining the tree itself. After merge or removal there are fewer segments (that is, leaf nodes) in the tree. This allows us to pack the non-leaf nodes of the tree more efficiently, and potentially remove layers from the tree. This keeps the tree compact, which has both memory and CPU performance implications.
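
To illustrate why fewer leaves can mean fewer layers, the sketch below rebuilds a tree bottom-up by packing nodes into blocks of at most `maxChildren`. It is not how the merge tree rebalances (which is incremental); `SketchBlock`, `chunk`, and `packLeaves` are hypothetical names, and the block size is an assumed parameter.

```typescript
// Hypothetical non-leaf node: children are either blocks or leaf strings.
interface SketchBlock {
    children: (SketchBlock | string)[];
}

// Split items into consecutive groups of at most `size`.
function chunk<T>(items: readonly T[], size: number): T[][] {
    const out: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        out.push(items.slice(i, i + size));
    }
    return out;
}

// Pack leaves into blocks, then blocks into parent blocks, until one root remains.
// Returns the root and the number of non-leaf layers.
function packLeaves(
    leaves: readonly string[],
    maxChildren: number,
): { root: SketchBlock; height: number } {
    if (maxChildren < 2) {
        throw new Error("maxChildren must be at least 2");
    }
    let blocks: SketchBlock[] =
        chunk<SketchBlock | string>(leaves, maxChildren).map((children) => ({ children }));
    let height = 1;
    while (blocks.length > 1) {
        blocks = chunk<SketchBlock | string>(blocks, maxChildren).map((children) => ({ children }));
        height++;
    }
    return { root: blocks[0] ?? { children: [] }, height };
}
```

With a block size of 8, nine leaves need two non-leaf layers, but after one merge or removal brings the count to eight, a single layer is enough.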
112 changes: 112 additions & 0 deletions packages/dds/merge-tree/src/test/client.localReferenceFarm.spec.ts
@@ -0,0 +1,112 @@
/*!
 * Copyright (c) Microsoft Corporation and contributors. All rights reserved.
 * Licensed under the MIT License.
 */

import { strict as assert } from "assert";
import random from "random-js";
import { LocalReference, ReferenceType } from "..";
import {
    IMergeTreeOperationRunnerConfig,
    removeRange,
    runMergeTreeOperationRunner,
    generateClientNames,
    IConfigRange,
} from "./mergeTreeOperationRunner";
import { TestClient } from "./testClient";
import { TestClientLogger } from "./testClientLogger";
import { doOverRange } from ".";

const defaultOptions: Record<"initLen" | "modLen", IConfigRange> & IMergeTreeOperationRunnerConfig = {
    initLen: {min: 2, max: 4},
    modLen: {min: 1, max: 8},
    opsPerRoundRange: {min: 10, max: 10},
    rounds: 10,
    operations: [removeRange],
    growthFunc: (input: number) => input * 2,
};

describe("MergeTree.Client", () => {
    // Generate a list of single character client names, support up to 69 clients
    const clientNames = generateClientNames();

    doOverRange(defaultOptions.initLen, defaultOptions.growthFunc, (initLen)=>{
        doOverRange(defaultOptions.modLen, defaultOptions.growthFunc, (modLen)=>{
            it(`LocalReferenceFarm_${initLen}_${modLen}`, async () => {
                const mt = random.engines.mt19937();
                mt.seedWithArray([0xDEADBEEF, 0xFEEDBED, initLen, modLen]);

                const clients: TestClient[] = new Array(3).fill(0).map(()=> new TestClient());
                clients.forEach(
                    (c, i) => c.startOrUpdateCollaboration(clientNames[i]));

                let seq = 0;
                // init with random values
                seq = runMergeTreeOperationRunner(
                    mt,
                    seq,
                    clients,
                    initLen,
                    defaultOptions,
                );
                // add local references
                const refs: LocalReference[][] = [];

                const validateRefs = (reason: string, workload: () => void)=>{
                    const preWorkload = TestClientLogger.toString(clients);
                    workload();
                    for(let c = 1; c < clients.length; c++) {
                        for(let r = 0; r < refs[c].length; r++) {
                            const pos0 = refs[0][r].toPosition();
                            const posC = refs[c][r].toPosition();
                            if(pos0 !== posC) {
                                assert.equal(
                                    pos0, posC,
                                    `${reason}:\n${preWorkload}\n${TestClientLogger.toString(clients)}`);
                            }
                        }
                    }
                    // console.log(`${reason}:\n${preWorkload}\n${TestClientLogger.toString(clients)}`)
                };

                validateRefs("Initialize", ()=>{
                    clients.forEach((c,i)=>{
                        refs.push([]);
                        for(let t = 0; t < c.getLength(); t++) {
                            const seg = c.getContainingSegment(t);
                            const lref = new LocalReference(c, seg.segment, seg.offset, ReferenceType.SlideOnRemove);
                            c.addLocalReference(lref);
                            lref.addProperties({t});
                            refs[i].push(lref);
                        }
                    });
                });

                validateRefs("After Init Zamboni",()=>{
                    // trigger zamboni multiple times as it is incremental
                    for(let i = clients[0].getCollabWindow().minSeq; i <= seq; i++) {
                        clients.forEach((c)=>c.updateMinSeq(i));
                    }
                });

                validateRefs("After More Ops", ()=>{
                    // init with random values
                    seq = runMergeTreeOperationRunner(
                        mt,
                        seq,
                        clients,
                        modLen,
                        defaultOptions,
                    );
                });

                validateRefs("After Final Zamboni",()=>{
                    // trigger zamboni multiple times as it is incremental
                    for(let i = clients[0].getCollabWindow().minSeq; i <= seq; i++) {
                        clients.forEach((c)=>c.updateMinSeq(i));
                    }
                });
            });
        });
    });
});
50 changes: 33 additions & 17 deletions packages/dds/merge-tree/src/test/testClientLogger.ts
@@ -54,6 +54,14 @@ export function createClientsAtInitialState<TClients extends ClientMap>(
     return {...clients, all};
 }
 export class TestClientLogger {
+    public static toString(clients: readonly TestClient[]) {
+        return clients.map((c)=>this.getSegString(c)).reduce<[string,string]>((pv,cv)=>{
+            pv[0] += `|${cv.acked.padEnd(cv.local.length,"")}`;
+            pv[1] += `|${cv.local.padEnd(cv.acked.length,"")}`;
+            return pv;
+        },["",""]).join("\n");
+    }
+
     private readonly incrementalLog = false;

     private readonly paddings: number[] = [];
@@ -80,7 +88,7 @@ export class TestClientLogger {
                 const clientLogIndex = i * 2;

                 this.ackedLine[clientLogIndex] = getOpString(op.sequencedMessage ?? c.makeOpMessage(op.op));
-                const segStrings = this.getSegString(c);
+                const segStrings = TestClientLogger.getSegString(c);
                 this.ackedLine[clientLogIndex + 1] = segStrings.acked;
                 this.localLine[clientLogIndex + 1] = segStrings.local;

@@ -109,14 +117,18 @@ export class TestClientLogger {
     }

     private addNewLogLine() {
-        if (this.incrementalLog) {
-            console.log(this.ackedLine.map((v, i) => v.padEnd(this.paddings[i])).join(" | "));
-            console.log(this.ackedLine.map((v, i) => v.padEnd(this.paddings[i])).join(" | "));
+        if(this.incrementalLog) {
+            while(this.roundLogLines.length > 0) {
+                const logLine = this.roundLogLines.shift();
+                if(logLine.some((c)=>c.trim().length > 0)) {
+                    console.log(logLine.map((v, i) => v.padEnd(this.paddings[i])).join(" | "));
+                }
+            }
         }
         this.ackedLine = [];
         this.localLine = [];
         this.clients.forEach((cc, clientLogIndex)=>{
-            const segStrings = this.getSegString(cc);
+            const segStrings = TestClientLogger.getSegString(cc);
             this.ackedLine.push("", segStrings.acked);
             this.localLine.push("", segStrings.local);

@@ -153,17 +165,21 @@ export class TestClientLogger {
         return baseText;
     }

-    public toString() {
-        let str =
-            `_: Local State\n`
-            + `-: Deleted\n`
-            + `*: Unacked Insert and Delete\n`
-            + `${this.clients[0].getCollabWindow().minSeq}: msn/offset\n`
-            + `Op format <seq>:<ref>:<client><type>@<pos1>,<pos2>\n`
-            + `sequence number represented as offset from msn. L means local.\n`
-            + `op types: 0) insert 1) remove 2) annotate\n`;
-        if (this.title) {
-            str += `${this.title}\n`;
+    public toString(excludeHeader: boolean = false) {
+        let str = "";
+        if(!excludeHeader) {
+            str +=
+                `_: Local State\n`
+                + `-: Deleted\n`
+                + `*: Unacked Insert and Delete\n`
+                + `${this.clients[0].getCollabWindow().minSeq}: msn/offset\n`
+                + `Op format <seq>:<ref>:<client><type>@<pos1>,<pos2>\n`
+                + `sequence number represented as offset from msn. L means local.\n`
+                + `op types: 0) insert 1) remove 2) annotate\n`;
+
+            if (this.title) {
+                str += `${this.title}\n`;
+            }
         }
         str += this.roundLogLines
             .filter((line) => line.some((c) => c.trim().length > 0))
@@ -172,7 +188,7 @@ export class TestClientLogger {
         return str;
     }

-    private getSegString(client: TestClient): { acked: string, local: string } {
+    private static getSegString(client: TestClient): { acked: string, local: string } {
         let acked: string = "";
         let local: string = "";
         const nodes = [...client.mergeTree.root.children];