#V8 #IntegerOverflow

Recent vulnerability reports rarely involve V8 itself; most bugs that start from JavaScript seem to end up manifesting in Blink or Chrome. Against that background, a recently filed bug caught our attention: it came with a PoC (Proof of Concept) that triggers an integer overflow inside V8. Analyzing it was a good opportunity to learn about the structure of V8 and about what an integer overflow is.

RegExp.prototype[@@replace]

The PoC ultimately exercises the C++ function Runtime_RegExpReplace() (hereinafter RegExpReplace). This function corresponds to the RegExp.prototype[@@replace] method (hereinafter replace) at the JavaScript level. RegExp is the object type used to express and process regular expressions. Take the following code as an example:

var re = /-/g; 

re is a RegExp object. The literal /-/g assigned to re matches every “-” in an arbitrary string. The ‘g’ at the end stands for “global” and means that all matching parts are found, not just the first. The usage and example of the replace() method given by MDN (Mozilla Developer Network) are as follows.

var str = '2016-01-01';
// regexp[Symbol.replace](str, newSubStr|function)
var newstr = re[Symbol.replace](str, '.');
console.log(newstr);  // 2016.01.01

The replace() method is a method of the RegExp object whose property key is a Symbol (Symbol.replace). The Symbol type is a recent addition to ECMAScript; it may be unfamiliar, but it is outside the scope of this post. The replace() method takes two arguments. The first argument is the original string, and the second argument is the replacement string that is substituted for each part of the original string that matches the regular expression. In the above example, every ‘-’ in the original string “2016-01-01” is replaced with ‘.’.

RegExp.prototype.exec()

The exec() method of the RegExp object returns, in the form of an object, the part of the original string that matches the regular expression. An example follows.

var re = /foo/g;
var result = re.exec('___foo___foo');

The object result contains information about the part of the original string that matches the given regular expression. Its output is shown below. The original string “___foo___foo” is stored in the input property, and “foo”, the substring matching the regular expression, is stored in property 0. The string contains two matches, but each exec() call reports only one; because the pattern has no capture groups, the result holds only the full match, so length is 1. The first match starts at position 3 in the original string, so 3 is stored in the index property.

result = {
  0:"foo"
  index:3
  input:"___foo___foo"
  length:1
  __proto__:Array(0)
}
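To make the roles of length and index concrete, the sketch below runs exec() on a pattern with two capture groups. The pattern and input are made up for illustration and do not come from the PoC.

```javascript
// Hypothetical pattern and input, chosen only to illustrate the result shape.
var reGroup = /(\d+)-(\d+)/g;
var input = 'a 10-20 b 30-40';

var first = reGroup.exec(input);
// Entry 0 is the full match and entries 1..n are the capture groups,
// so length is (number of capture groups) + 1.
console.log(first.length);   // 3
console.log(first[0]);       // "10-20"
console.log(first.index);    // 2

// With the g flag, the next call resumes from lastIndex and reports the second match.
var second = reGroup.exec(input);
console.log(second[0]);      // "30-40"
console.log(second.index);   // 10
```

Each call describes a single match, which is why the earlier /foo/g example reports a length of 1 even though “foo” occurs twice.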

RegExpUtils::RegExpExec()

Inside V8, the exec() method appears to be invoked through the RegExpExec() function. To replace a part that matches a regular expression, you first have to find it, so the RegExpReplace() function calls RegExpExec(), as described in ECMAScript. This function returns result, which is an object of class Object. The following is the output of the Object class's Print() member function for that object.

0x24c761393899: [JSArray]
 - map: 0x3288b1786611 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x39681e485539 <JSArray[0]>
 - elements: 0x24c7613938d1 <FixedArray[1]> [HOLEY_ELEMENTS]
 - length: 1
 - properties: 0x17db08a02251 <FixedArray[0]> {
    #length: 0x17db08a4ff89 <AccessorInfo> (const accessor descriptor)
    #index: 3 (data field 0)
    #input: 0x39681e4aa699 <String[12]: ___foo___foo> (data field 1)
    #groups: 0x17db08a022e1 <undefined> (data field 2)
 }
 - elements: 0x24c7613938d1 <FixedArray[1]> {
           0: 0x24c761393879 <String[3]: foo>
 }

The dump shows the following about the object result:

1. It is an object of class JSArray (the JSArray class inherits from the Object class).
2. It has two FixedArray objects, elements and properties.
3. The string “foo” is stored in entry 0 of elements.
4. length is 1, which is the number of entries in elements.
5. properties has entries named index, input, and groups.

The result object printed from the JavaScript layer earlier and the one printed from the C++ layer appear to hold the same contents. The JSArray and Object classes deserve a separate look later.
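The hand-off from replace() to exec() can also be observed from JavaScript. This is a minimal sketch, assuming an engine that follows the RegExpExec operation in ECMAScript; wrappedExec and execCalls are illustrative names, not V8 code.

```javascript
var reWrapped = /foo/;
var execCalls = 0;

// Shadow exec() on this one RegExp instance with a wrapper that counts calls
// and then delegates to the built-in implementation.
reWrapped.exec = function wrappedExec(str) {
  execCalls++;
  return RegExp.prototype.exec.call(this, str);
};

var replaced = reWrapped[Symbol.replace]('___foo___foo', 'bar');
console.log(replaced);       // "___bar___foo" (no g flag, so only the first match)
console.log(execCalls > 0);  // true: replace() went through the overridden exec()
```

Because replace() looks up exec() on the RegExp object at call time, script code can decide what the "match result" looks like.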

Flow of Runtime_RegExpReplace

In the same way, printing the object result from the C++ layer while running the PoC gives the following.

0x2bea81a10e31: [JS_OBJECT_TYPE]
 - map: 0x98850a8cf41 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x22efc2684649 <Object map = 0x98850a822b1>
 - elements: 0x7268cc02251 <FixedArray[0]> [HOLEY_ELEMENTS]
 - properties: 0x7268cc02251 <FixedArray[0]> {
    #length: <unboxed double> 4.29497e+09 (data field 0)
 }

This PoC replaces the exec() method with an arbitrary function whose result carries a length of 0xfffffffe. In the output above, the length entry of properties, which is a FixedArray object, holds the double value 4.29497e+09. This value is the same as 0xfffffffe (4,294,967,294). That is, the value supplied through the replaced exec() method is stored in the length property as a double.
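The original PoC is not reproduced here. The sketch below only illustrates the mechanism described above, using a harmless length of 1 instead of 0xfffffffe; fakeExec and the returned object are hypothetical stand-ins, not the PoC's code.

```javascript
var rePlain = /foo/;

// Hypothetical stand-in for the PoC's replaced exec(): it ignores the real
// matcher and hands back a plain object that merely looks like a match result.
rePlain.exec = function fakeExec(str) {
  return { 0: 'foo', index: 3, length: 1 };
};

// replace() reads "0", "index" and "length" from whatever exec() returned.
var outPlain = rePlain[Symbol.replace]('___foo___', 'bar');
console.log(outPlain);  // "___bar___"

// The value seen in the dump: 0xfffffffe printed with six significant digits.
console.log(0xfffffffe);                   // 4294967294
console.log((0xfffffffe).toPrecision(6));  // "4.29497e+9"
```

The returned object is a plain object rather than an array, which matches the JS_OBJECT_TYPE shown in the dump.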

length is the number of entries in result: the full match plus any capture groups. The RegExpReplace() function therefore has to loop up to length to collect the captures it needs for the replacement. This loop can be found in the code as well as in the ECMAScript specification. The RegExpReplace() function stores the value of the length property of the object result in captures_length, a variable of type int.

Integer overflow by explicit type conversion

The integer overflow occurs where the variable captures_length receives the value of the length property of the object result through the PositiveNumberToUint32() function. The name of the function already hints at the problem. This function receives the value of the length property, converts it to an unsigned 32-bit integer, and returns it. The problem is that the variable captures_length that receives this return value is of type int. As a result, even though the value of the length property is 4,294,967,294, the value of the variable captures_length is -2.
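The conversion can be mimicked in JavaScript, whose bitwise operators work on 32-bit integers. This is only an illustration of the arithmetic, assuming a 32-bit two's-complement int on the C++ side; it is not the V8 code itself.

```javascript
var lengthValue = 0xfffffffe;            // 4,294,967,294, as in the PoC dump

// Step 1: the value fits in an unsigned 32-bit integer unchanged.
var asUint32 = lengthValue >>> 0;
console.log(asUint32);                   // 4294967294

// Step 2: the same 32 bits read as a signed integer.
var asInt32 = asUint32 | 0;
console.log(asInt32);                    // -2

// The same reinterpretation made explicit with typed arrays sharing one buffer.
var bits = new Uint32Array([lengthValue]);
var viaTypedArray = new Int32Array(bits.buffer)[0];
console.log(viaTypedArray);              // -2
```

The bit pattern never changes; only its interpretation does, which is why a value just below 2^32 shows up as a small negative number.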

Crashing

Following the code further, the variable argc of type int is calculated as captures_length + 2 and ends up as 0. If the exec() replacement in the PoC is changed to yield 0xfffffffd instead, the value of argc becomes -1. The RegExpReplace() function then attempts to dynamically allocate an array; in the process Chromium deliberately raises an OOM (Out of Memory) error and the renderer process is terminated.
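The two argc values can be checked with the same 32-bit arithmetic. argcFor is a hypothetical helper written for this post, and the final line rests on an assumption the analysis above does not confirm: that the negative count is later treated as an unsigned size.

```javascript
// Mimic `int argc = captures_length + 2` with 32-bit signed arithmetic.
function argcFor(lengthValue) {
  var capturesLength = lengthValue | 0;   // uint32 bits read as int
  return (capturesLength + 2) | 0;
}

console.log(argcFor(0xfffffffe));  // 0
console.log(argcFor(0xfffffffd));  // -1

// Assumption for illustration: if a negative count is treated as an unsigned
// size, -1 becomes the largest 32-bit value (larger still on 64-bit).
console.log(argcFor(0xfffffffd) >>> 0);  // 4294967295
```

Under that assumption, an allocation request of this size cannot be satisfied, which would be consistent with the OOM termination described above.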