Skip to content

Commit

Permalink
Gumbo: Fix the segfault that lcamtuf identified through fuzz-testing:
Browse files Browse the repository at this point in the history
http://lcamtuf.coredump.cx/.smieci./gumbo_segv.small

The problem occurred when a formatting element was opened, a <frameset>
resulted in all nodes being popped off the <body>, and then a whitespace
character followed the ending <html> tag.  When the <body> is removed and
replaced with the frameset, I destroy it to prevent a memory leak.
Unfortunately this also frees the formatting element contained in it, but there
are references still on the list of active formatting elements.  When that list
is reconstructed to add the whitespace character at the end of the <html>, the
non-existent formatting nodes are cloned, causing a crash.

The fix I'm doing is just to clear the list of active formatting elements when
the <body> is replaced.  This isn't quite in the spec, but this is enough of an
edge case that I think the deviation is okay.

This also fixes another bug where the ending </html> tag wasn't being properly recorded after framesets.
-------------
Created by MOE: http://code.google.com/p/moe-java
MOE_MIGRATED_REVID=57745367
  • Loading branch information
Jonathan Tang committed Feb 18, 2014
1 parent 87b99f2 commit 713e70b
Show file tree
Hide file tree
Showing 3 changed files with 12 additions and 2 deletions.
11 changes: 11 additions & 0 deletions src/parser.c
Original file line number Diff line number Diff line change
Expand Up @@ -2372,6 +2372,13 @@ static bool handle_in_body(GumboParser* parser, GumboToken* token) {
node = pop_current_node(parser);
} while (node != state->_open_elements.data[1]);

// Removing & destroying the body node is going to kill any nodes that have
// been added to the list of active formatting elements, and so we should
// clear it to prevent a use-after-free if the list of active formatting
// elements is reconstructed afterwards. This may happen if whitespace
// follows the </frameset>.
clear_active_formatting_elements(parser);

// Remove the body node. We may want to factor this out into a generic
// helper, but right now this is the only code that needs to do this.
GumboVector* children = &parser->_output->root->v.element.children;
Expand Down Expand Up @@ -3543,6 +3550,10 @@ static bool handle_after_frameset(GumboParser* parser, GumboToken* token) {
} else if (tag_is(token, kStartTag, GUMBO_TAG_HTML)) {
return handle_in_body(parser, token);
} else if (tag_is(token, kEndTag, GUMBO_TAG_HTML)) {
GumboNode* html = parser->_parser_state->_open_elements.data[0];
assert(node_tag_is(html, GUMBO_TAG_HTML));
record_end_of_element(
parser->_parser_state->_current_token, &html->v.element);
set_insertion_mode(parser, GUMBO_INSERTION_MODE_AFTER_AFTER_FRAMESET);
return true;
} else if (tag_is(token, kStartTag, GUMBO_TAG_NOFRAMES)) {
Expand Down
2 changes: 1 addition & 1 deletion tests/test_utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,9 +21,9 @@
#include <stdint.h>
#include <string>

#include "gtest/gtest.h"
#include "gumbo.h"
#include "parser.h"
#include "gtest/gtest.h"

inline std::string ToString(const GumboStringPiece& str) {
return std::string(str.data, str.length);
Expand Down
1 change: 0 additions & 1 deletion tests/tokenizer.cc
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,6 @@
#include "gtest/gtest.h"
#include "test_utils.h"


extern const char* kGumboTagNames[];

namespace {
Expand Down

0 comments on commit 713e70b

Please sign in to comment.