Class AvoidEscapedUnicodeCharactersCheck

  • All Implemented Interfaces:
    Configurable, Contextualizable

    public class AvoidEscapedUnicodeCharactersCheck
    extends AbstractCheck

    Restricts the use of Unicode escapes (such as \u221e). Escapes can be allowed for non-printable (control) characters (allowEscapesForControlCharacters). The check can also be configured to allow escapes when a trailing comment is present (allowByTailComment), when the literal contains only escapes (allowIfAllCharactersEscaped), or for space literals (allowNonPrintableEscapes).

    Examples of using Unicode:

     String unitAbbrev = "μs";      // Best: perfectly clear even without a comment.
     String unitAbbrev = "\u03bcs"; // Poor: the reader has no idea what this is.
     

    An example of how to configure the check is:

     <module name="AvoidEscapedUnicodeCharacters"/>
     

    An example of a non-printable (control) character:

     return '\ufeff' + content; // byte order mark
     

    An example of how to configure the check to allow escapes for non-printable (control) characters:

     <module name="AvoidEscapedUnicodeCharacters">
         <property name="allowEscapesForControlCharacters" value="true"/>
     </module>
     

    An example of using escapes with a trailing comment:

     String unitAbbrev = "\u03bcs"; // Greek letter mu, "s"
     

    An example of how to configure the check to allow escapes if a trailing comment is present:

     <module name="AvoidEscapedUnicodeCharacters">
         <property name="allowByTailComment" value="true"/>
     </module>
     

    An example of a literal that contains only escapes:

     String unitAbbrev = "\u03bc\u03bc\u03bc";
     

    An example of how to configure the check to allow escapes if the literal contains only escapes:

     <module name="AvoidEscapedUnicodeCharacters">
        <property name="allowIfAllCharactersEscaped" value="true"/>
     </module>
     

    An example of how to configure the check to allow non-printable escapes:

     <module name="AvoidEscapedUnicodeCharacters">
        <property name="allowNonPrintableEscapes" value="true"/>
     </module>
     
    • Field Detail

      • MSG_KEY

        public static final java.lang.String MSG_KEY
        A key pointing to the warning message text in the "messages.properties" file.
        See Also:
        Constant Field Values
      • UNICODE_REGEXP

        private static final java.util.regex.Pattern UNICODE_REGEXP
        Regular expression for Unicode chars.
      • UNICODE_CONTROL

        private static final java.util.regex.Pattern UNICODE_CONTROL
        Regular expression for Unicode control characters.
        See Also:
        Appendix:Control characters
      • ALL_ESCAPED_CHARS

        private static final java.util.regex.Pattern ALL_ESCAPED_CHARS
        Regular expression for all escaped chars.
      • ESCAPED_BACKSLASH

        private static final java.util.regex.Pattern ESCAPED_BACKSLASH
        Regular expression for escaped backslash.
      • NON_PRINTABLE_CHARS

        private static final java.util.regex.Pattern NON_PRINTABLE_CHARS
        Regular expression for non-printable Unicode chars.
      • singlelineComments

        private java.util.Map<java.lang.Integer,​TextBlock> singlelineComments
        Cpp style comments.
      • blockComments

        private java.util.Map<java.lang.Integer,​java.util.List<TextBlock>> blockComments
        C style comments.
      • allowEscapesForControlCharacters

        private boolean allowEscapesForControlCharacters
        Allow the use of escapes for non-printable (control) characters.
      • allowByTailComment

        private boolean allowByTailComment
        Allow the use of escapes if a trailing comment is present.
      • allowIfAllCharactersEscaped

        private boolean allowIfAllCharactersEscaped
        Allow if all characters in literal are escaped.
      • allowNonPrintableEscapes

        private boolean allowNonPrintableEscapes
        Allow escapes for space literals.
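        As an illustration of the kind of literal this last option is presumably aimed at, here is a minimal sketch. It assumes that the no-break space (U+00A0) is among the characters the check treats as non-printable, which this page does not confirm; the class and field names are hypothetical.

```java
public class NonPrintableEscapeExample {
    // Assumption: U+00A0 (no-break space) counts as non-printable for the check.
    // Written as an escape it is visible in source; typed directly it would
    // look exactly like an ordinary space.
    static final String SEPARATOR = "\u00a0";

    public static void main(String[] args) {
        // The escape is translated at compile time into a single character.
        System.out.println(SEPARATOR.length());
        // The character is a Unicode space separator, so it renders as blank.
        System.out.println(Character.isSpaceChar(SEPARATOR.charAt(0)));
    }
}
```

        In a case like this the escaped form is arguably the more readable one, which may be why the check offers an option to permit it.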
    • Constructor Detail

      • AvoidEscapedUnicodeCharactersCheck

        public AvoidEscapedUnicodeCharactersCheck()
    • Method Detail

      • setAllowEscapesForControlCharacters

        public final void setAllowEscapesForControlCharacters​(boolean allow)
        Set allowEscapesForControlCharacters.
        Parameters:
        allow - user's value.
      • setAllowByTailComment

        public final void setAllowByTailComment​(boolean allow)
        Set allowByTailComment.
        Parameters:
        allow - user's value.
      • setAllowIfAllCharactersEscaped

        public final void setAllowIfAllCharactersEscaped​(boolean allow)
        Set allowIfAllCharactersEscaped.
        Parameters:
        allow - user's value.
      • setAllowNonPrintableEscapes

        public final void setAllowNonPrintableEscapes​(boolean allow)
        Set allowNonPrintableEscapes.
        Parameters:
        allow - user's value.
      • getDefaultTokens

        public int[] getDefaultTokens()
        Description copied from class: AbstractCheck
        Returns the default token a check is interested in. Only used if the configuration for a check does not define the tokens.
        Specified by:
        getDefaultTokens in class AbstractCheck
        Returns:
        the default tokens
        See Also:
        TokenTypes
      • getAcceptableTokens

        public int[] getAcceptableTokens()
        Description copied from class: AbstractCheck
        The configurable token set. Used to protect Checks against malicious users who specify an unacceptable token set in the configuration file. The default implementation returns the check's default tokens.
        Specified by:
        getAcceptableTokens in class AbstractCheck
        Returns:
        the token set this check is designed for.
        See Also:
        TokenTypes
      • getRequiredTokens

        public int[] getRequiredTokens()
        Description copied from class: AbstractCheck
        The tokens that this check must be registered for.
        Specified by:
        getRequiredTokens in class AbstractCheck
        Returns:
        the token set this must be registered for.
        See Also:
        TokenTypes
      • beginTree

        public void beginTree​(DetailAST rootAST)
        Description copied from class: AbstractCheck
        Called before starting to process a tree. Ideal place to initialize information that is to be collected whilst processing a tree.
        Overrides:
        beginTree in class AbstractCheck
        Parameters:
        rootAST - the root of the tree
      • hasUnicodeChar

        private static boolean hasUnicodeChar​(java.lang.String literal)
        Checks if literal has Unicode chars.
        Parameters:
        literal - String literal.
        Returns:
        true if literal has Unicode chars.
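        A minimal sketch of how such a test could work on the raw source text of a literal. Both patterns below are assumptions chosen for illustration, not the values of UNICODE_REGEXP and ESCAPED_BACKSLASH, and the class name is hypothetical.

```java
import java.util.regex.Pattern;

public class HasUnicodeCharSketch {
    // Assumed pattern: one backslash, one or more 'u', then four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u+[a-fA-F0-9]{4}");
    // Assumed pattern: an escaped backslash, i.e. two backslashes in a row.
    private static final Pattern ESCAPED_BACKSLASH =
            Pattern.compile("\\\\\\\\");

    // 'literal' is the literal as written in source, escapes not yet translated.
    static boolean hasUnicodeChar(String literal) {
        // Remove escaped backslashes first, so that a backslash pair followed
        // by 'u' and hex digits is not mistaken for a Unicode escape.
        String withoutEscapedBackslashes =
                ESCAPED_BACKSLASH.matcher(literal).replaceAll("");
        return UNICODE_ESCAPE.matcher(withoutEscapedBackslashes).find();
    }

    public static void main(String[] args) {
        // Source text with a single backslash before u03bc: an escape.
        System.out.println(hasUnicodeChar("\"\\u03bcs\""));
        // Source text with two backslashes before u03bc: not an escape.
        System.out.println(hasUnicodeChar("\"\\\\u03bcs\""));
    }
}
```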
      • isOnlyUnicodeValidChars

        private static boolean isOnlyUnicodeValidChars​(java.lang.String literal,
                                                       java.util.regex.Pattern pattern)
        Checks whether every Unicode escape in the String literal matches the given pattern.
        Parameters:
        literal - String literal.
        pattern - RegExp for valid characters.
        Returns:
        true if the String literal contains only Unicode escapes that match the pattern.
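        One plausible way to implement this kind of test is to count all Unicode escapes and compare that with the number of escapes matching the pattern. The sketch below assumes that approach; both patterns and the class name are hypothetical, and the control-character pattern covers only U+0000 to U+001F for the sake of the demo.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OnlyValidEscapesSketch {
    // Assumed pattern for any Unicode escape in raw source text.
    static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u+[a-fA-F0-9]{4}");
    // Hypothetical "valid" pattern: escapes for U+0000 through U+001F only.
    static final Pattern CONTROL_ESCAPE =
            Pattern.compile("\\\\u+00[01][a-fA-F0-9]");

    // Counts non-overlapping matches of the pattern in the target string.
    static int countMatches(Pattern pattern, String target) {
        int count = 0;
        Matcher matcher = pattern.matcher(target);
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // True when every escape in the literal also matches the given pattern.
    static boolean isOnlyUnicodeValidChars(String literal, Pattern pattern) {
        return countMatches(UNICODE_ESCAPE, literal)
                == countMatches(pattern, literal);
    }

    public static void main(String[] args) {
        // Two control-character escapes: both counts are 2.
        System.out.println(
                isOnlyUnicodeValidChars("\"\\u0009\\u000b\"", CONTROL_ESCAPE));
        // One control escape plus a Greek letter escape: counts are 2 and 1.
        System.out.println(
                isOnlyUnicodeValidChars("\"\\u0009\\u03bc\"", CONTROL_ESCAPE));
    }
}
```

        Note that a literal without any escapes trivially passes in this sketch, so a caller would need to establish separately that the literal contains an escape at all.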
      • hasTrailComment

        private boolean hasTrailComment​(DetailAST ast)
        Checks if a trailing comment is present after the ast token.
        Parameters:
        ast - current token.
        Returns:
        true if a trailing comment is present after the ast token.
      • isTrailingBlockComment

        private static boolean isTrailingBlockComment​(TextBlock comment,
                                                      java.lang.String line)
        Whether the C style comment is trailing.
        Parameters:
        comment - the comment to check.
        line - the line where the comment starts.
        Returns:
        true if the comment is trailing.
      • countMatches

        private static int countMatches​(java.util.regex.Pattern pattern,
                                        java.lang.String target)
        Counts regexp matches in a String literal.
        Parameters:
        pattern - pattern.
        target - String literal.
        Returns:
        count of regexp matches.
      • isAllCharactersEscaped

        private boolean isAllCharactersEscaped​(java.lang.String literal)
        Checks if all characters in the String literal are escaped.
        Parameters:
        literal - current literal.
        Returns:
        true if all characters in the String literal are escaped.
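        A minimal sketch of such a test, assuming the literal is passed with its surrounding quotes and that "escaped" means Unicode escapes only. The pattern is a simplified stand-in for ALL_ESCAPED_CHARS, whose actual value is not shown on this page, and the class name is hypothetical.

```java
import java.util.regex.Pattern;

public class AllEscapedSketch {
    // Assumed pattern: the whole body consists of one or more Unicode escapes.
    private static final Pattern ALL_ESCAPED =
            Pattern.compile("(\\\\u+[a-fA-F0-9]{4})+");

    // Mirrors the allowIfAllCharactersEscaped property of the check.
    private final boolean allowIfAllCharactersEscaped;

    AllEscapedSketch(boolean allowIfAllCharactersEscaped) {
        this.allowIfAllCharactersEscaped = allowIfAllCharactersEscaped;
    }

    // 'literal' is the raw source text, including the surrounding quotes.
    boolean isAllCharactersEscaped(String literal) {
        if (!allowIfAllCharactersEscaped || literal.length() < 2) {
            return false;
        }
        // Strip the opening and closing quote before matching the body.
        String body = literal.substring(1, literal.length() - 1);
        return ALL_ESCAPED.matcher(body).matches();
    }

    public static void main(String[] args) {
        AllEscapedSketch enabled = new AllEscapedSketch(true);
        // Three escapes and nothing else.
        System.out.println(
                enabled.isAllCharactersEscaped("\"\\u03bc\\u03bc\\u03bc\""));
        // One escape followed by a plain 's'.
        System.out.println(enabled.isAllCharactersEscaped("\"\\u03bcs\""));
    }
}
```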