Row 2 (Det. PCF) shows the results when breaks are placed after every punctuation mark and before every function word that follows a content word. The number of correct breaks increases considerably, but only because of a massive over-prediction of break placement. The junctures-correct score and the insertion score are the worst of any experiment described here.
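For concreteness, a minimal sketch of this deterministic rule is given below. The (word, class) token representation, the class labels, and the function name are assumptions made for illustration; they are not the implementation used in the experiments.

```python
# Sketch of the deterministic PCF rule: break after every punctuation
# mark and before every function word that follows a content word.

def det_pcf_breaks(tokens):
    """Return the set of juncture indices at which a break is placed;
    juncture i lies between tokens[i] and tokens[i + 1]."""
    breaks = set()
    for i in range(len(tokens) - 1):
        cls, next_cls = tokens[i][1], tokens[i + 1][1]
        if cls == "punctuation":
            breaks.add(i)       # break after punctuation
        elif cls == "content" and next_cls == "function":
            breaks.add(i)       # break before function word after content
    return breaks

# Toy example: a break is placed before "in", mid-phrase, which
# illustrates the rule's tendency to over-predict.
tokens = [("the", "function"), ("dog", "content"), ("in", "function"),
          ("the", "function"), ("yard", "content"), (",", "punctuation"),
          ("barked", "content")]
print(det_pcf_breaks(tokens))   # {1, 5}
```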
Rows 3 (Prob. P 1-gram) and 4 (Prob. P 6-gram) show the results from our algorithm using a POS sequence model with one tag following and one preceding the juncture (L=2, M=1, V={non-punctuation-word, punctuation}). Row 3 shows that the probabilistic punctuation-only algorithm produces very similar results to its deterministic counterpart when a 1-gram phrase-break model is used, but that break prediction accuracy increases when a higher-order n-gram phrase-break model (N=6) is used (row 4). Row 5 (Prob. PCF 1-gram) shows that dividing the non-punctuation class into content words and function words has no significant effect on performance (L=2, M=1, V={function, content, punctuation}) compared to row 3. If we look at the relevant POS sequence frequency counts in the training data, we find that there are 1751 non-break instances and 1002 break instances of the content-function sequence. Thus a non-break is always more probable at such junctures, and with a 1-gram phrase-break model, breaks will never be inserted there. With a higher-order phrase-break model, however, the combined probability of a break rises as the distance from the last break grows beyond a few words, making breaks more probable. The figures in row 6 (Prob. PCF 6-gram) show this effect.
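The combination of the two models can be made concrete with a small Viterbi sketch over juncture decisions, where each search state is the last N-1 decisions required by the n-gram phrase-break model. The function name, the dictionary-based probability tables, and the smoothing floor are illustrative assumptions, not the implementation used in the experiments.

```python
# Viterbi search over juncture decisions, scoring each juncture with a
# POS sequence model P(context | j) and an n-gram phrase-break model
# P(j | previous n-1 decisions).

import math

JUNCTURES = ("break", "non-break")

def viterbi_breaks(contexts, pos_model, break_ngram, n):
    """Most probable juncture sequence under the combined model.

    contexts[i]  -- POS window around juncture i, e.g. a
                    (tag_before, tag_after) pair (L=2, M=1).
    pos_model    -- maps (context, j) to P(context | j).
    break_ngram  -- maps (history, j) to P(j | history), where history
                    is a tuple of the previous n-1 decisions.
    """
    floor = 1e-10                       # smoothing floor for unseen events
    start = ("non-break",) * (n - 1)    # pad the initial history
    best = {start: 0.0}                 # best log-prob into each state
    back = []                           # one backpointer dict per juncture
    for ctx in contexts:
        new_best, new_back = {}, {}
        for hist, score in best.items():
            for j in JUNCTURES:
                s = (score
                     + math.log(pos_model.get((ctx, j), floor))
                     + math.log(break_ngram.get((hist, j), floor)))
                state = (hist + (j,))[1:] if n > 1 else ()
                if state not in new_best or s > new_best[state]:
                    new_best[state] = s
                    new_back[state] = (hist, j)
        best = new_best
        back.append(new_back)
    # Walk the backpointers to recover the decision sequence.
    state = max(best, key=best.get)
    decisions = []
    for ptrs in reversed(back):
        hist, j = ptrs[state]
        decisions.append(j)
        state = hist
    return list(reversed(decisions))
```

With N=1 the history is empty, so each juncture is decided independently by comparing P(context | break)P(break) against P(context | non-break)P(non-break); given the counts above (1751 non-break versus 1002 break instances), the non-break term always wins at content-function junctures. With N=6, a run of non-breaks in the history raises P(break | history) enough to flip that comparison, which is the effect visible in row 6.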
In summary, the deterministic punctuation algorithm is ``safe'' but massively under-predicts, while the deterministic punctuation/content/function-word algorithm massively over-predicts. Their probabilistic counterparts perform acceptably only if a high-order phrase-break model is used.