UTF-8 Locator

You can use the UTF-8 Locator to discover bugs in applications that accept user-supplied input. In UTF-8 as originally specified (RFC 2279), characters are encoded using sequences of 1 to 6 octets; the current specification (RFC 3629) limits sequences to 4 octets. The only octet of a "sequence" of one has the higher-order bit set to 0, the remaining 7 bits being used to encode the character value. In a sequence of n octets, n > 1, the initial octet has the n higher-order bits set to 1, followed by a bit set to 0. The remaining bit(s) of that octet contain bits from the value of the character to be encoded. The following octet(s) all have the higher-order bit set to 1 and the following bit set to 0, leaving 6 bits in each to contain bits from the character to be encoded.
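The bit layout described above can be sketched in a few lines of Python. This is only an illustration of the encoding rules, not the tool's own code: the function name `utf8_encode` is made up for this example, it covers sequences of 1 to 4 octets only, and it does not reject surrogate code points the way a strict encoder would.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point by hand, following the UTF-8 bit layout."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point out of range")

    # Sequence of one octet: high bit 0, remaining 7 bits hold the value.
    if code_point < 0x80:
        return bytes([code_point])

    # Pick the sequence length n from the size of the value.
    if code_point < 0x800:
        n = 2
    elif code_point < 0x10000:
        n = 3
    else:
        n = 4

    # Continuation octets: bits "10" followed by 6 bits of the value,
    # collected from the least significant end.
    tail = []
    for _ in range(n - 1):
        tail.append(0x80 | (code_point & 0x3F))
        code_point >>= 6

    # Initial octet: n high bits set to 1, then a 0, then the leftover bits.
    lead_prefix = (0xFF << (8 - n)) & 0xFF
    lead = lead_prefix | code_point

    return bytes([lead] + tail[::-1])


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
        print(f"U+{cp:04X} -> {utf8_encode(cp).hex(' ')}")
```

Comparing the result with Python's built-in `chr(cp).encode("utf-8")` is a quick way to check the sketch against a reference encoder.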

In a nutshell, the UTF-8 Locator can be used to trigger encoding and decoding issues in applications. Those issues may surface before, during, or after processing of the input and can lead to different kinds of behavior; how they are handled depends on the application and the platform.
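To illustrate how the same bytes can be handled differently, here is a minimal Python sketch that decodes a few malformed sequences with three of the standard error handlers. The sample bytes and the helper name `decode_report` are chosen for this example and are not taken from the tool's output.

```python
# Hypothetical sample inputs; the file produced by the tool may differ.
SAMPLES = {
    "overlong '/' (C0 AF)": b"\xc0\xaf",
    "truncated 3-octet sequence": b"\xe2\x82",
    "stray continuation octet": b"\x80",
    "well-formed euro sign": b"\xe2\x82\xac",
}


def decode_report(data: bytes) -> dict:
    """Decode the same bytes strictly, with replacement, and with errors ignored."""
    report = {}
    try:
        report["strict"] = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # A strict decoder refuses malformed input outright.
        report["strict"] = f"rejected: {exc.reason}"
    # A lenient decoder substitutes U+FFFD for each malformed part.
    report["replace"] = data.decode("utf-8", errors="replace")
    # An even more lenient one silently drops the malformed octets.
    report["ignore"] = data.decode("utf-8", errors="ignore")
    return report


if __name__ == "__main__":
    for label, data in SAMPLES.items():
        print(label, decode_report(data))
```

A component that drops or replaces malformed octets ends up with a different string than one that rejects them, and that kind of mismatch between components is where such bugs tend to appear.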

Fill in "message to include" (optional) and keep the default amount of "10.000", or raise or lower it as needed. Click "Create UTF" to run it. This creates a file called utflocator.txt and also displays its content. Copy the file to its destination, or copy the content into a user-supplied input, to try to trigger a vulnerability.