Page MenuHomeFreeBSD

Add Semantic Line Break checking with vale
ClosedPublic

Authored by bofh on Apr 13 2023, 5:24 PM.
Tags
Referenced Files
Unknown Object (File)
Oct 12 2024, 6:06 PM
Unknown Object (File)
Oct 12 2024, 6:06 PM
Unknown Object (File)
Oct 12 2024, 6:06 PM
Unknown Object (File)
Oct 12 2024, 6:06 PM
Unknown Object (File)
Oct 12 2024, 6:06 PM
Unknown Object (File)
Oct 5 2024, 7:00 PM
Unknown Object (File)
Oct 5 2024, 1:39 AM
Unknown Object (File)
Oct 2 2024, 1:20 PM

Details

Summary

In our doc guidelines we use "One Sentence Per Line" or more technically Semantic Line Breaking. Use vale to implement this.

Diff Detail

Repository
R9 FreeBSD doc repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bofh requested review of this revision.Apr 13 2023, 5:24 PM
bofh created this revision.
bofh added a project: doceng.

I have tested it successfully on this quarter status reports.

Two suggestions for improvements:

  • I had two false positives due to abbreviations (Inc. and e.g.);
  • it is a bit annoying that it always stops analyzing files at the first error, it would be nicer if all wrong lines were reported at once. Maybe changing level: error to something lower would be a good idea.

Maybe also remove max: 1.

No, sorry: this must stay. I thought it refered to how many times the rule was allowed to fail before stopping checking the file, but I realized that it counts the number of sentences instead.

I have tested it successfully on this quarter status reports.

Two suggestions for improvements:

  • I had two false positives due to abbreviations (Inc. and e.g.);

Ah yes this is fixable. I will need to add another rule.

  • it is a bit annoying that it always stops analyzing files at the first error, it would be nicer if all wrong lines were reported at once. Maybe changing level: error to something lower would be a good idea.

Oh damn yeah. I never meant to set it to error but warning. I will update the diff.

Maybe also remove max: 1.

No, sorry: this must stay. I thought it refered to how many times the rule was allowed to fail before stopping checking the file, but I realized that it counts the number of sentences instead.

Yes this counts the number of sentences.

I have tested it successfully on this quarter status reports. …

Can you recall, were there any multi-sentence list items (bullet points)?

Thinking back a few months … my perceptions of ambiguity might have differed from other people's perceptions.

I have tested it successfully on this quarter status reports. …

Can you recall, were there any multi-sentence list items (bullet points)?

Thinking back a few months … my perceptions of ambiguity might have differed from other people's perceptions.

Yes, we have some in this quarter FreeBSD Foundation status report, which I am fixing in https://reviews.freebsd.org/D39567#change-bq5lJMuUrbA0.

Thanks. Running vale here (with diff 120268), I see that it catches line 100, but not the list items.

Thanks. Running vale here (with diff 120268), I see that it catches line 100, but not the list items.

Have you tried switching from level: error to level: warning as suggested in the above comments?

Have you tried switching from level: error to level: warning as suggested in the above comments?

Phabricator hid the comments, thanks for the hint.

No difference:

% vale website/content/en/status/report-2023-01-2023-03/freebsd-foundation.adoc | grep sentence
 110:98   warning  Only use one sentence per       FreeBSD.SemanticLineBreak 
% grep level .vale/styles/FreeBSD/SemanticLineBreak.yml
level: warning
%

I guess, bullet points (or list items or whatever) are exempt from paragraph scope, even when the points are integral to what the human sees as a paragraph.

From https://github.com/errata-ai/vale/discussions/512#discussioncomment-4112643:

You could also use script for more fine-grained control.

I'm not troubled by the bullet point example, just curious.

Since I remember I had spotted those multisentences list items through vale, I have tested it again. I confirm what @grahamperrin reports: even with level: warning it only reports line 110. I add that the same happens with level: suggestion.

However, if line 110 is fixed, then vale reports line 89. When line 89 is fixed, it reports line 91 and so on. So, for some reason, vale applies the rule only once.

… if line 110 is fixed, then vale reports line 89. When line 89 is fixed, it reports line 91 and so on. So, for some reason, vale applies the rule only once.

% vale documentation/content/en/books/handbook/cutting-edge/_index.adoc | grep Semantic
 801:69   warning  Only use one sentence per       FreeBSD.SemanticLineBreak 
%

This fails to catch at least two occurrences earlier in the file. https://github.com/freebsd/freebsd-doc/pull/164/commits/6fd0575ff225da6cb748f248b01d0c280ab29302:

  • lines 250 and 254.

Toying with scopes:

% vale documentation/content/en/books/handbook/cutting-edge/_index.adoc | grep Semantic
 250:90   warning  Only use one sentence per       FreeBSD.SemanticLineBreak 
% grep scope .vale/styles/FreeBSD/SemanticLineBreak.yml
scope: summary
% nano /usr/doc/.vale/styles/FreeBSD/SemanticLineBreak.yml
% grep scope .vale/styles/FreeBSD/SemanticLineBreak.yml
scope: paragraph
% vale documentation/content/en/books/handbook/cutting-edge/_index.adoc | grep Semantic
 801:69   warning  Only use one sentence per       FreeBSD.SemanticLineBreak 
%

Modify warning level to suggestion

There is another scope called list which might be helpful for bullet points.

Handle semantic line breaking in lists.

Since I remember I had spotted those multisentences list items through vale, I have tested it again. I confirm what @grahamperrin reports: even with level: warning it only reports line 110. I add that the same happens with level: suggestion.

However, if line 110 is fixed, then vale reports line 89. When line 89 is fixed, it reports line 91 and so on. So, for some reason, vale applies the rule only once.

Unless you have already committed 89/90/110 now it should display all the suggestion. :D

I have checked out the repository to a version before the commit and tested. I confirm that it now works as expected, showing lines 89, 90, 91 and 110 all at once, thanks @bofh.

This revision is now accepted and ready to land.Apr 18 2023, 6:12 PM

@salvadore Somewhere mentioned about some exceptions with Inc. and e.g. which will contain . but not apparently at the end of line. This patch handles this exception.

@grahamperrin I have seen that thread as it was created by me. However the way vale operates is based on rules naming preference so in case a line has multiple errors it will show only one and will generate another once the previous is fixed. So I believe we are good with this.

This revision now requires review to proceed.Apr 19 2023, 6:25 PM
This revision is now accepted and ready to land.Apr 19 2023, 6:32 PM