Page Speed Optimization Libraries
1.2.24.1
#include "html_lexer.h"
Public Member Functions

HtmlLexer (HtmlParse *html_parse)

void StartParse (const StringPiece &id, const ContentType &content_type)
    Initializes a new parse session; id is used only for error messages.

void Parse (const char *text, int size)

void FinishParse ()
    Completes the parse, reporting any leftover text as a final HtmlCharacterEvent.

bool IsImplicitlyClosedTag (HtmlName::Keyword keyword) const
    Determines whether a tag is implicitly closed in HTML, so that no close tag is expected (e.g. <meta>).

bool TagAllowsBriefTermination (HtmlName::Keyword keyword) const
    Determines whether a tag can be terminated briefly (e.g. <tag/>).

bool IsOptionallyClosedTag (HtmlName::Keyword keyword) const
    Determines whether it is OK to leave a tag unclosed.

void DebugPrintStack ()
    Prints the element stack to stdout (for debugging).

HtmlElement * Parent () const

const DocType & doctype () const

void set_size_limit (int64 x)
    Sets the limit on the maximum number of bytes that should be parsed.

bool size_limit_exceeded () const
Constructs a re-entrant HTML lexer. This lexer minimally parses tags, attributes, and comments. It is intended to parse the Wild West of the Web. It's designed to be tolerant of syntactic transgressions, merely passing through unparseable chunks as Characters.
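The tolerance described above can be illustrated with a small standalone sketch. It is not the library's implementation: the Event struct and the LexTolerantly function are hypothetical names invented for this example, and the rule for what counts as a tag is deliberately crude. It only shows the idea of passing unparseable input through as character data instead of failing.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical event type for this sketch; not the library's event classes.
struct Event {
  enum Kind { kTag, kCharacters } kind;
  std::string text;
};

// Appends literal text, merging it into a preceding Characters event.
static void AddCharacters(std::vector<Event>* events, const std::string& text) {
  if (text.empty()) return;
  if (!events->empty() && events->back().kind == Event::kCharacters) {
    events->back().text += text;
  } else {
    events->push_back({Event::kCharacters, text});
  }
}

// Toy tolerant lexer: "<name ...>" or "</name>" becomes a Tag event; anything
// that does not look like a tag is passed through unchanged as Characters.
std::vector<Event> LexTolerantly(const std::string& html) {
  std::vector<Event> events;
  std::size_t pos = 0;
  while (pos < html.size()) {
    std::size_t lt = html.find('<', pos);
    if (lt == std::string::npos) {
      AddCharacters(&events, html.substr(pos));
      break;
    }
    AddCharacters(&events, html.substr(pos, lt - pos));
    std::size_t gt = html.find('>', lt);
    bool looks_like_tag =
        gt != std::string::npos && lt + 1 < gt &&
        (std::isalpha(static_cast<unsigned char>(html[lt + 1])) ||
         html[lt + 1] == '/');
    if (looks_like_tag) {
      events.push_back({Event::kTag, html.substr(lt, gt - lt + 1)});
      pos = gt + 1;
    } else {
      // Not a well-formed tag: keep the '<' as literal text and move on.
      AddCharacters(&events, "<");
      pos = lt + 1;
    }
  }
  return events;
}
```

With the input "a < b <p>x</p>", this sketch yields a Characters event "a < b ", a Tag event "<p>", a Characters event "x", and a Tag event "</p>"; the stray '<' never causes an error.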
const DocType& net_instaweb::HtmlLexer::doctype () const [inline]
Return the current assumed doctype of the document (based on the content type and any HTML directives encountered so far).
HtmlElement* net_instaweb::HtmlLexer::Parent () const
Returns the current lowest-level parent element in the element stack, or NULL if the stack is empty.
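As a rough illustration of that contract, the sketch below models the element stack with a vector of pointers. ToyElement and ToyElementStack are hypothetical names, and the sketch assumes that "lowest-level" means the innermost element that is still open; the library's own bookkeeping is not shown in this reference.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal element; the library's HtmlElement carries much more.
struct ToyElement {
  std::string name;
};

// Toy element stack: the back of the vector is the innermost open element.
class ToyElementStack {
 public:
  void Push(ToyElement* element) { stack_.push_back(element); }

  void Pop() {
    if (!stack_.empty()) stack_.pop_back();
  }

  // Innermost open element, or nullptr when no element is open.
  ToyElement* Parent() const {
    return stack_.empty() ? nullptr : stack_.back();
  }

 private:
  std::vector<ToyElement*> stack_;  // Not owned.
};
```

Callers of such an accessor need to check for the null result before dereferencing it, since text can appear before any element has been opened.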
void net_instaweb::HtmlLexer::Parse (const char *text, int size)
Parse a chunk of text, adding events to the parser by calling html_parse_->AddEvent(...).
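The StartParse / Parse / FinishParse sequence can be sketched with a toy lexer that accepts input in arbitrary chunks. Everything here is an assumption made for illustration: ToyChunkLexer is a hypothetical class, events are plain strings appended to a vector in place of html_parse_->AddEvent(...), and the tag rule is simply "everything from '<' to the next '>'". The point is only that a tag split across two chunks stays buffered until it is complete, and that leftover text is flushed at the end.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the lexer and its event sink.
class ToyChunkLexer {
 public:
  void StartParse() {
    pending_.clear();
    events_.clear();
  }

  // Consumes one chunk. Complete "<...>" runs are emitted as tag events and
  // the text before them as character events. An unfinished tag at the end
  // of the chunk stays buffered until the next call.
  void Parse(const char* text, int size) {
    pending_.append(text, static_cast<std::size_t>(size));
    std::size_t pos = 0;
    while (pos < pending_.size()) {
      std::size_t lt = pending_.find('<', pos);
      if (lt == std::string::npos) lt = pending_.size();
      if (lt > pos) {
        events_.push_back("chars:" + pending_.substr(pos, lt - pos));
      }
      pos = lt;
      if (lt == pending_.size()) break;  // No '<' left in the buffer.
      std::size_t gt = pending_.find('>', lt);
      if (gt == std::string::npos) break;  // Tag not finished yet.
      events_.push_back("tag:" + pending_.substr(lt, gt - lt + 1));
      pos = gt + 1;
    }
    pending_.erase(0, pos);
  }

  // Flushes whatever is still buffered as a final character event.
  void FinishParse() {
    if (!pending_.empty()) events_.push_back("chars:" + pending_);
    pending_.clear();
  }

  const std::vector<std::string>& events() const { return events_; }

 private:
  std::string pending_;              // Bytes not yet turned into events.
  std::vector<std::string> events_;  // Emitted events, in order.
};
```

Feeding "<di" and then "v>hi<" produces no event after the first chunk, then "tag:<div>" and "chars:hi" after the second, and "chars:<" once FinishParse runs. In this sketch, character data that spans a chunk boundary is emitted as separate events rather than merged.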
bool net_instaweb::HtmlLexer::size_limit_exceeded () const [inline]
Indicates whether we have exceeded the limit on the maximum number of bytes that we should parse.
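One way such a limit could be tracked is sketched below. ToySizeLimiter and its Count method are hypothetical, and two behaviors are assumptions rather than documented facts: that a limit of zero or less means "no limit", and that the flag stays set once the running byte count passes the limit. What the library does with bytes beyond the limit is not stated in this reference and is not modeled here.

```cpp
#include <cstdint>

// Hypothetical byte counter mirroring the two size-limit accessors.
class ToySizeLimiter {
 public:
  void set_size_limit(std::int64_t x) { size_limit_ = x; }

  bool size_limit_exceeded() const { return exceeded_; }

  // Records that `size` more bytes were handed to the parser.
  void Count(int size) {
    bytes_seen_ += size;
    // Assumption: a non-positive limit disables the check.
    if (size_limit_ > 0 && bytes_seen_ > size_limit_) {
      exceeded_ = true;
    }
  }

 private:
  std::int64_t size_limit_ = -1;  // Assumed default: no limit.
  std::int64_t bytes_seen_ = 0;
  bool exceeded_ = false;
};
```

With a limit of 10, counting 6 and then 4 bytes leaves the flag clear, and one further byte sets it.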