net_instaweb::HtmlLexer Class Reference

#include "html_lexer.h"


Public Member Functions

 HtmlLexer (HtmlParse *html_parse)
void StartParse (const StringPiece &id, const ContentType &content_type)
 Initialize a new parse session, id is only used for error messages.
void Parse (const char *text, int size)
void FinishParse ()
 Completes parse, reporting any leftover text as a final HtmlCharacterEvent.
bool IsImplicitlyClosedTag (HtmlName::Keyword keyword) const
 Determines whether a tag is implicitly closed in HTML, i.e. should appear without a close tag.
bool TagAllowsBriefTermination (HtmlName::Keyword keyword) const
 Determines whether a tag can be terminated briefly (e.g. <tag/>).
bool IsOptionallyClosedTag (HtmlName::Keyword keyword) const
 Determines whether it's OK to leave a tag unclosed.
void DebugPrintStack ()
 Print element stack to stdout (for debugging).
HtmlElement * Parent () const
 Returns the current lowest-level parent element in the element stack.
const DocType & doctype () const
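
The three tag predicates above can be illustrated with a minimal sketch. It assumes that "implicitly closed" corresponds to the HTML standard's void elements and that "optionally closed" corresponds to elements whose end tag may be omitted. The namespace `sketch`, the function names, and the string-keyed tables are hypothetical; the real methods take an HtmlName::Keyword and their tables may differ.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins keyed by lowercase tag name. They are NOT the
// HtmlLexer implementation, only an illustration of the two categories.
namespace sketch {

// Void elements in the HTML standard: they never take a close tag.
inline bool IsImplicitlyClosed(const std::string& tag) {
  static const std::unordered_set<std::string> kVoid = {
      "area", "base", "br",   "col",  "embed",  "hr",    "img",
      "input", "link", "meta", "source", "track", "wbr"};
  return kVoid.count(tag) > 0;
}

// A subset of the elements whose end tag the HTML standard lets
// authors omit, so leaving them unclosed is not an error.
inline bool IsOptionallyClosed(const std::string& tag) {
  static const std::unordered_set<std::string> kOptional = {
      "li", "p",  "dt", "dd",     "tr",    "td",
      "th", "option", "thead", "tbody", "tfoot"};
  return kOptional.count(tag) > 0;
}

}  // namespace sketch
```

Under these assumptions, `br` and `meta` are implicitly closed, `li` and `p` are optionally closed, and `div` is neither, so an unclosed `div` would be the case worth reporting.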

Detailed Description

Constructs a re-entrant HTML lexer. This lexer minimally parses tags, attributes, and comments. It is intended to parse the Wild West of the Web. It's designed to be tolerant of syntactic transgressions, merely passing through unparseable chunks as Characters.
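
The tolerant behavior described above can be sketched with a toy tokenizer. This is not the HtmlLexer implementation: the `sketch` namespace, the `Event` struct, and `Tokenize` are hypothetical names, attributes are skipped rather than parsed, and the tag-end search is deliberately naive. The point it shows is that anything that cannot be read as a tag or comment is passed through as characters instead of raising an error.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // Hypothetical; not part of net_instaweb.

enum class Kind { kCharacters, kStartTag, kEndTag, kComment };

struct Event {
  Kind kind;
  std::string text;  // Tag name, comment body, or literal characters.
};

// Appends literal text, merging it into a preceding Characters event.
inline void AddCharacters(std::vector<Event>* out, const std::string& s) {
  if (s.empty()) return;
  if (!out->empty() && out->back().kind == Kind::kCharacters) {
    out->back().text += s;
  } else {
    out->push_back({Kind::kCharacters, s});
  }
}

inline std::vector<Event> Tokenize(const std::string& in) {
  std::vector<Event> out;
  std::size_t i = 0;
  while (i < in.size()) {
    if (in[i] != '<') {
      // Plain text runs up to the next '<' or the end of input.
      std::size_t next = in.find('<', i);
      if (next == std::string::npos) next = in.size();
      AddCharacters(&out, in.substr(i, next - i));
      i = next;
      continue;
    }
    // Comment: <!-- ... -->
    if (in.compare(i, 4, "<!--") == 0) {
      std::size_t end = in.find("-->", i + 4);
      if (end != std::string::npos) {
        out.push_back({Kind::kComment, in.substr(i + 4, end - (i + 4))});
        i = end + 3;
        continue;
      }
    }
    // Tag: <name ...> or </name>. Attributes are skipped, not parsed.
    std::size_t j = i + 1;
    const bool closing = j < in.size() && in[j] == '/';
    if (closing) ++j;
    const std::size_t name_start = j;
    while (j < in.size() &&
           std::isalnum(static_cast<unsigned char>(in[j]))) {
      ++j;
    }
    const std::size_t close = in.find('>', j);
    if (j > name_start && close != std::string::npos) {
      out.push_back({closing ? Kind::kEndTag : Kind::kStartTag,
                     in.substr(name_start, j - name_start)});
      i = close + 1;
      continue;
    }
    // Not parseable as a comment or tag: pass the '<' through as text.
    AddCharacters(&out, "<");
    ++i;
  }
  return out;
}

}  // namespace sketch
```

For example, `sketch::Tokenize("<p>a < b</p>")` yields a start tag `p`, one Characters event holding `a < b`, and an end tag `p`; the stray `<` is kept as text rather than rejected.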

Todo:
TODO(jmarantz): refactor this with html_parse, so that this class owns the symbol table and the event queue, and no longer needs to mutually depend on HtmlParse. That will make it easier to unit-test.

Member Function Documentation

const DocType& net_instaweb::HtmlLexer::doctype (  )  const [inline]

Return the current assumed doctype of the document (based on the content type and any HTML directives encountered so far).

void net_instaweb::HtmlLexer::Parse ( const char * text, int size )

Parse a chunk of text, adding events to the parser by calling html_parse_->AddEvent(...).
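
Because Parse takes one chunk at a time, a tag may be split across calls. The sketch below shows one way a chunked lexer can handle that: it buffers an incomplete tag until the rest arrives and, as documented for FinishParse, reports leftover text as a final characters event. `sketch::ChunkedLexer` is a hypothetical stand-in that records events as strings in a vector; the real class forwards events via html_parse_->AddEvent(...) and its StartParse takes an id and content type, which are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // Hypothetical; not part of net_instaweb.

class ChunkedLexer {
 public:
  // Begins a new session, discarding any state from a previous one.
  void StartParse() {
    buffer_.clear();
    events_.clear();
  }

  // Accepts the next chunk and emits every event that is now complete.
  void Parse(const char* text, int size) {
    buffer_.append(text, static_cast<std::size_t>(size));
    Drain();
  }

  // Flushes whatever is still buffered as a final characters event.
  void FinishParse() {
    if (!buffer_.empty()) {
      events_.push_back("chars:" + buffer_);
      buffer_.clear();
    }
  }

  const std::vector<std::string>& events() const { return events_; }

 private:
  void Drain() {
    while (!buffer_.empty()) {
      if (buffer_[0] != '<') {
        // Text up to the next '<' (or everything buffered) is complete.
        std::size_t lt = buffer_.find('<');
        if (lt == std::string::npos) lt = buffer_.size();
        events_.push_back("chars:" + buffer_.substr(0, lt));
        buffer_.erase(0, lt);
      } else {
        std::size_t gt = buffer_.find('>');
        if (gt == std::string::npos) return;  // Tag incomplete: wait.
        events_.push_back("tag:" + buffer_.substr(1, gt - 1));
        buffer_.erase(0, gt + 1);
      }
    }
  }

  std::string buffer_;               // Unconsumed input across chunks.
  std::vector<std::string> events_;  // Stand-in for the event queue.
};

}  // namespace sketch
```

Feeding the chunks `<di`, `v>hi`, and `</div` and then calling FinishParse produces `tag:div`, `chars:hi`, and `chars:</div`: the split start tag is reassembled, and the never-completed close tag is passed through as text. In this sketch, text that straddles a chunk boundary is emitted as separate characters events.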


The documentation for this class was generated from the following file:
Generated on Tue May 29 16:33:48 2012 for Page Speed Optimization Libraries by  doxygen 1.6.3