OK, I figured it out. Here's my best stab at how it works, based on trial and error. If anyone knows better, please comment:
The dictionary is a byte array of null-terminated strings. Each string can be one or more characters long. If you expect certain patterns to show up frequently, include them, and put the most important ones toward the end of the dictionary. For example, if you are compressing XML data with well-known tag names, you could put the tags in the dictionary after the letters, numbers, and punctuation that might be in your input, e.g. (assume a zero byte after each string):
A
B
C
(etc)
Z
a
b
(etc)
z
<Customer>
<Address>
<PaymentInfo>
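To make that layout concrete, here is a minimal sketch of building the same kind of byte array at runtime. `DictionaryBuilder` and `Build` are names made up for illustration. The sketch assumes ASCII-only entries and the null-terminated layout described above, which is only my trial-and-error understanding.

```csharp
using System.Collections.Generic;
using System.Text;

public static class DictionaryBuilder
{
    // Hypothetical helper: concatenates the entries in the order given,
    // writing a zero byte after each one. Pass the most important
    // strings last so they end up toward the end of the dictionary.
    public static byte[] Build(IEnumerable<string> entries)
    {
        var bytes = new List<byte>();
        foreach (string entry in entries)
        {
            bytes.AddRange(Encoding.ASCII.GetBytes(entry)); // assumes ASCII-only entries
            bytes.Add(0);                                   // null terminator
        }
        return bytes.ToArray();
    }
}
```

Calling `DictionaryBuilder.Build(new[] { "A", "B", "<Customer>", "<Address>" })` would give the single letters first and the tags last, matching the ordering suggested above.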
Here is a snippet of the C# code I used to generate the C# source for my dictionary (which is compiled into the app as static data). It's taken from a Windows Forms app that has one text box for the dictionary strings (one per line) and one text box for the output C# code, which I then paste into a static byte array in the app that needs the dictionary:
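// Converts one dictionary string into C# source: a run of "(byte)'x', " literals.
private string MakeBytes(string line, bool appendNull)
{
    StringBuilder sb = new StringBuilder();
    foreach (char ch in line)
    {
        sb.Append("(byte)'");
        if (ch == '\'' || ch == '\\')
        {
            // A quote or backslash has to be escaped inside a char literal.
            sb.Append('\\').Append(ch);
        }
        else if (char.IsLetterOrDigit(ch) || char.IsPunctuation(ch))
        {
            sb.Append(ch);
        }
        else
        {
            // Everything else (spaces, symbols such as '<') becomes a hex escape.
            sb.Append(@"\x").Append(((int)ch).ToString("x2"));
        }
        sb.Append("', ");
    }
    if (appendNull)
    {
        sb.Append(@"(byte)'\0', ");
    }
    return sb.ToString();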
}

private void btnConvert_Click(object sender, EventArgs e)
{
    string[] lines = txtInput.Text.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
    string buffer = string.Empty;
    foreach (string line in lines)
    {
        buffer += MakeBytes(line, true);
        if (buffer.Length >= 80)
        {
            // Wrap the generated code at roughly 80 columns.
            txtOutput.Text += buffer + Environment.NewLine;
            buffer = string.Empty;
        }
    }
    // Flush whatever is left so the last few entries are not dropped.
    txtOutput.Text += buffer;
}
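
For reference, pasting the generated text into the app could look roughly like this. `CompressionDictionary` and `Bytes` are placeholder names, and the three entries (A, B, and <a>) are just a tiny example, not my real dictionary. Note that `<` and `>` come out as hex escapes because `char.IsPunctuation` does not count them as punctuation.

```csharp
public static class CompressionDictionary
{
    // Placeholder name. The initializer is the kind of text the generator
    // emits for the entries "A", "B" and "<a>", each followed by a zero byte.
    public static readonly byte[] Bytes =
    {
        (byte)'A', (byte)'\0', (byte)'B', (byte)'\0',
        (byte)'\x3c', (byte)'a', (byte)'\x3e', (byte)'\0',
    };
}
```

The generator leaves a trailing comma after the last element. C# allows that in an array initializer, so the output can be pasted as-is.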