const char* net_instaweb::HtmlElement::Attribute::DecodedValueOrNull() const [inline]
The result of DecodedValueOrNull() is still owned by this, and will be invalidated by a subsequent call to SetValue().
The result will be a NUL-terminated string containing the value of the attribute, or NULL if the attribute has no value at all (this is distinct from having the empty string for a value) or if there is a decoding error. E.g.
<tag a="val"> –> "val"
<tag a="&amp;"> –> "&"
<tag a=""> –> ""
<tag a> –> NULL
<tag a="muñecos"> –> NULL (decoding_error()==true)
Returns the unescaped value, suitable for directly operating on in filters as URLs or other data. Note that decoding_error() is true if the parsed value from HTML could not be decoded. This might occur if:
- the charset is not known
- the charset is not supported. Currently none are supported, and only values that fall in 7-bit ASCII can be interpreted.
- the charset is known and supported, but the value does not appear to be legal.
The decoded value uses 8-bit characters to represent any unicode code-point less than 256.
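The outcomes described above (a value, the empty string, NULL with or without a decoding error) can be told apart by a caller roughly as follows. This is a minimal sketch, not PageSpeed code: `FakeAttribute` and `Describe` are hypothetical names, and the stand-in only mimics the documented contract of DecodedValueOrNull() and decoding_error().

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in that mimics only the documented contract of
// DecodedValueOrNull() and decoding_error(); it is not the real class.
struct FakeAttribute {
  const char* decoded;  // NULL when there is no value or decoding failed
  bool error;           // true when the source value could not be decoded

  const char* DecodedValueOrNull() const { return decoded; }
  bool decoding_error() const { return error; }
};

// Reports which of the documented cases an attribute falls into.
// Returning std::string copies the data, so the result stays valid even
// if the attribute's value is later replaced.
std::string Describe(const FakeAttribute& attr) {
  const char* value = attr.DecodedValueOrNull();
  if (value == nullptr) {
    return attr.decoding_error() ? "undecodable" : "no value";
  }
  if (*value == '\0') {
    return "empty string";
  }
  return std::string("value: ") + value;
}

// Small self-check covering the documented cases.
void DescribeDemo() {
  assert(Describe(FakeAttribute{"val", false}) == "value: val");
  assert(Describe(FakeAttribute{"", false}) == "empty string");
  assert(Describe(FakeAttribute{nullptr, false}) == "no value");
  assert(Describe(FakeAttribute{nullptr, true}) == "undecodable");
}
```

Because the returned pointer is owned by the attribute, copying it into a `std::string` before any call that changes the value is the safe pattern.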
const char* net_instaweb::HtmlElement::Attribute::escaped_value() const [inline]
Returns the value in its original form, directly from the HTML source. This may have HTML escapes in it, such as "&amp;".
keyword() returns the HTML keyword enum. If this attribute name is not recognized, it returns HtmlName::kNotAKeyword, and you can examine name_str().
StringPiece net_instaweb::HtmlElement::Attribute::name_str() const [inline]
A large quantity of HTML in the wild has attributes that are improperly escaped. Browsers are generally tolerant of this. But we want to avoid corrupting pages we do not understand.
The result of DecodedValueOrNull() and escaped_value() is still owned by this, and will be invalidated by a subsequent call to SetValue() or SetUnescapedValue.
Returns the attribute name, which is not guaranteed to be case-folded. Compare keyword() to the Keyword constant found in html_name.h for fast attribute comparisons.
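Since the name is not guaranteed to be case-folded, a caller that falls back to comparing names as strings needs a case-insensitive comparison. The following is a minimal sketch under stated assumptions: `std::string_view` stands in for StringPiece, and `NameEqualsIgnoreCase` is a hypothetical helper, not a function from the library (which may well provide its own).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: ASCII case-insensitive comparison of two
// attribute names. std::string_view is used as a stand-in for StringPiece.
bool NameEqualsIgnoreCase(std::string_view a, std::string_view b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Cast to unsigned char first: std::tolower has undefined behavior
    // for negative values other than EOF.
    if (std::tolower(static_cast<unsigned char>(a[i])) !=
        std::tolower(static_cast<unsigned char>(b[i]))) {
      return false;
    }
  }
  return true;
}

// Small self-check.
void NameCompareDemo() {
  assert(NameEqualsIgnoreCase("HREF", "href"));
  assert(NameEqualsIgnoreCase("Src", "sRC"));
  assert(!NameEqualsIgnoreCase("src", "srcset"));
}
```

For recognized attributes, comparing the keyword enum avoids the per-character loop entirely, which is why the text recommends it for fast comparisons.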
QuoteStyle net_instaweb::HtmlElement::Attribute::quote_style() const [inline]
See the comment about quote on the constructor for Attribute. Returns the quotation mark associated with this attribute.
void net_instaweb::HtmlElement::Attribute::SetEscapedValue(const StringPiece& value)
Sets the escaped value. This is intended to be called from the HTML Lexer, and results in the Value being computed automatically by scanning the value for escape sequences.
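To illustrate what "scanning the value for escape sequences" might look like, here is a deliberately small sketch. It is not the library's decoder: `DecodeAttribute` is a hypothetical function, it recognizes only four named entities, and it treats any byte outside 7-bit ASCII as a decoding error (an assumption modeled on the decoding_error() behavior documented for this class).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// One named entity and the character it decodes to.
struct Entity {
  std::string_view text;
  char ch;
};

// Hypothetical decoder. Returns false (and clears *decoded) when the
// input contains a byte outside 7-bit ASCII; otherwise writes the
// unescaped value to *decoded and returns true.
bool DecodeAttribute(std::string_view escaped, std::string* decoded) {
  static const Entity kEntities[] = {
      {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'}, {"&quot;", '"'}};
  decoded->clear();
  std::size_t i = 0;
  while (i < escaped.size()) {
    if (static_cast<unsigned char>(escaped[i]) >= 0x80) {
      decoded->clear();
      return false;  // assumption: non-ASCII cannot be interpreted
    }
    bool matched = false;
    for (const Entity& e : kEntities) {
      if (escaped.substr(i, e.text.size()) == e.text) {
        decoded->push_back(e.ch);
        i += e.text.size();
        matched = true;
        break;
      }
    }
    if (!matched) {
      // Unknown or malformed escapes are passed through unchanged.
      decoded->push_back(escaped[i]);
      ++i;
    }
  }
  return true;
}

// Small self-check.
void DecodeDemo() {
  std::string out;
  assert(DecodeAttribute("a&amp;b", &out) && out == "a&b");
  assert(DecodeAttribute("x &bogus; y", &out) && out == "x &bogus; y");
  assert(!DecodeAttribute("mu\xC3\xB1" "ecos", &out) && out.empty());
}
```

Passing unknown escapes through unchanged is one way to avoid corrupting improperly escaped markup; whether the real lexer does the same is not stated here.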
void net_instaweb::HtmlElement::Attribute::SetValue(const StringPiece& value)
Two related methods to modify the value of an attribute (e.g. to rewrite the destination of src or href). As with the constructor, each copies the string in, so the caller retains ownership of value.
A StringPiece pointing to an empty string (that is, a char array {'\0'}) indicates that the attribute value is the empty string (e.g. <foo bar="">); however, a StringPiece with a data() pointer of NULL indicates that the attribute has no value at all (e.g. <foo bar>). This is an important distinction.
Note that passing a value containing NULs in the middle will cause breakage, but this isn't currently checked for.
- Todo:
- TODO(mdsteele): Perhaps we should check for this?
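The distinction between an empty value and no value, and the unchecked embedded-NUL case, can be sketched as follows. This assumes `std::string_view` behaves like StringPiece with respect to data() and size(); `ValueKind`, `Classify`, and `HasEmbeddedNul` are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <string_view>

enum class ValueKind { kNoValue, kEmpty, kNonEmpty };

// Hypothetical helper: classifies a value the way the text describes.
// A null data() pointer means "no value at all"; a non-null pointer
// with zero length means "the empty string".
ValueKind Classify(std::string_view value) {
  if (value.data() == nullptr) {
    return ValueKind::kNoValue;
  }
  return value.empty() ? ValueKind::kEmpty : ValueKind::kNonEmpty;
}

// Hypothetical pre-check a caller could run before setting a value,
// since embedded NULs are documented as unchecked.
bool HasEmbeddedNul(std::string_view value) {
  return value.find('\0') != std::string_view::npos;
}

// Small self-check.
void ClassifyDemo() {
  assert(Classify(std::string_view()) == ValueKind::kNoValue);   // <foo bar>
  assert(Classify(std::string_view("")) == ValueKind::kEmpty);   // <foo bar="">
  assert(Classify("x") == ValueKind::kNonEmpty);
  assert(HasEmbeddedNul(std::string_view("a\0b", 3)));
  assert(!HasEmbeddedNul("ab"));
}
```

A default-constructed `std::string_view` has a null data() pointer, while one built from `""` points at a real (empty) character array, which mirrors the two cases above.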
Sets the value of the attribute. No HTML escaping is expected. This call causes the HTML-escaped value to be automatically computed by scanning the value and escaping any characters required in HTML attributes.
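As a rough illustration of computing an HTML-escaped value from an unescaped one, consider the sketch below. `EscapeForAttribute` is a hypothetical function, and the set of characters it escapes (`&`, `"`, `<`, `>`) is an assumption; the exact set escaped by the library is not specified in this documentation.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical escaper: replaces characters that commonly need escaping
// inside a double-quoted HTML attribute. The character set is an
// assumption, not taken from the library.
std::string EscapeForAttribute(std::string_view value) {
  std::string out;
  out.reserve(value.size());
  for (char c : value) {
    switch (c) {
      case '&': out += "&amp;"; break;
      case '"': out += "&quot;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      default:  out += c; break;
    }
  }
  return out;
}

// Small self-check.
void EscapeDemo() {
  assert(EscapeForAttribute("a.html?x=1&y=2") == "a.html?x=1&amp;y=2");
  assert(EscapeForAttribute("say \"hi\"") == "say &quot;hi&quot;");
  assert(EscapeForAttribute("plain") == "plain");
}
```

Note that the ampersand is handled per character, so already-escaped input such as "&amp;" would be escaped a second time; this matches the statement that no HTML escaping is expected in the value passed in.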
The documentation for this class was generated from the following file: