Documentation ¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
var JavascriptURIPattern = regexp.MustCompile(
	`(?i)=[\s\x0b]*["']?[\s\x0b]*` +
		`(?:` +
		`(?:&(?:amp;)?#x0*(?:9|a|b|c|d|20);?)` +
		`|(?:&(?:amp;)?#0*(?:9|10|11|12|13|32);?)` +
		`|(?:&(?:tab|newline);)` +
		`|[\s\x0b]` +
		`)*` +
		`javascript:`,
)
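JavascriptURIPattern matches javascript: in attribute contexts only, including leading-whitespace bypasses written as raw characters or as HTML entities. Tab, LF, VT, FF, CR, and space are matched in hex and decimal numeric forms; the named forms &tab; and &newline; are matched as well (the whole pattern is case-insensitive). Semicolons on numeric entities are optional, to match legacy browser behaviour. Double-encoded entity prefixes (&amp;) on numeric entities are also matched.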
The pattern avoids false positives on plain text like "JavaScript: a language" by requiring an = before the value (attribute context).
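As an illustration of the behaviour described above, here is a minimal sketch that runs the pattern against a few inputs. The pattern is copied into a local variable so the snippet compiles on its own; matchesJavascriptURI and the sample strings are hypothetical and not part of this package.

```go
package main

import (
	"fmt"
	"regexp"
)

// javascriptURIPattern is a local copy of the documented pattern so that this
// sketch is self-contained; real callers would use the exported variable.
var javascriptURIPattern = regexp.MustCompile(
	`(?i)=[\s\x0b]*["']?[\s\x0b]*` +
		`(?:` +
		`(?:&(?:amp;)?#x0*(?:9|a|b|c|d|20);?)` +
		`|(?:&(?:amp;)?#0*(?:9|10|11|12|13|32);?)` +
		`|(?:&(?:tab|newline);)` +
		`|[\s\x0b]` +
		`)*` +
		`javascript:`,
)

// matchesJavascriptURI is a hypothetical helper used only in this example.
func matchesJavascriptURI(s string) bool {
	return javascriptURIPattern.MatchString(s)
}

func main() {
	samples := []string{
		`<a href="javascript:alert(1)">`,         // plain attribute value
		`<a href="&#x09;javascript:alert(1)">`,   // hex-encoded tab prefix
		`<a href="&amp;#10javascript:alert(1)">`, // double-encoded LF, no semicolon
		`<a href="&Tab;JaVaScRiPt:alert(1)">`,    // named entity, mixed case
		`<p>JavaScript: a language</p>`,          // no "=" before the value
	}
	for _, s := range samples {
		fmt.Printf("%v\t%s\n", matchesJavascriptURI(s), s)
	}
}
```

The first four samples match and the last does not, because nothing in the plain-text sentence supplies the leading = that the pattern requires. Note that any text containing = followed by javascript: will match, whether or not it sits inside a tag, so the pattern is a heuristic rather than an HTML parser.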
var SuspiciousPageHTMLTokens = []string{
	"<script",
	"onerror=",
	"onload=",
	"<iframe",
}
SuspiciousPageHTMLTokens lists substrings that indicate potentially malicious markup in user-supplied page HTML.
Functions ¶
func DetectSuspiciousHTMLTokens ¶ added in v0.18.1
DetectSuspiciousHTMLTokens returns the subset of SuspiciousPageHTMLTokens found in body (case-insensitive), plus "javascript:" if the URI pattern matches. Callers use the result to warn or block page saves.
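The description above can be sketched roughly as follows. This is not the package's implementation: the signature (a string in, a slice of strings out), the lower-casing approach, and the names detectSuspiciousTokens, suspiciousTokens, and javascriptURIPattern are assumptions made for illustration, with the token list and pattern copied locally so the snippet stands alone.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Local copies of the documented variables (assumed to be unchanged).
var suspiciousTokens = []string{"<script", "onerror=", "onload=", "<iframe"}

var javascriptURIPattern = regexp.MustCompile(
	`(?i)=[\s\x0b]*["']?[\s\x0b]*` +
		`(?:` +
		`(?:&(?:amp;)?#x0*(?:9|a|b|c|d|20);?)` +
		`|(?:&(?:amp;)?#0*(?:9|10|11|12|13|32);?)` +
		`|(?:&(?:tab|newline);)` +
		`|[\s\x0b]` +
		`)*` +
		`javascript:`,
)

// detectSuspiciousTokens is a hypothetical re-implementation of the documented
// behaviour: case-insensitive substring checks, plus "javascript:" when the
// URI pattern matches.
func detectSuspiciousTokens(body string) []string {
	lower := strings.ToLower(body)
	var found []string
	for _, tok := range suspiciousTokens {
		if strings.Contains(lower, tok) {
			found = append(found, tok)
		}
	}
	if javascriptURIPattern.MatchString(body) {
		found = append(found, "javascript:")
	}
	return found
}

func main() {
	body := `<IMG src=x OnError=alert(1)><a href=" javascript:alert(2)">x</a>`
	if hits := detectSuspiciousTokens(body); len(hits) > 0 {
		// A caller might warn here, or refuse to save the page.
		fmt.Printf("suspicious markup: %q\n", hits)
	}
}
```

In this sketch an empty result means nothing suspicious was found, so a caller can treat len(hits) > 0 as the signal to warn or block.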
func SanitizePageHTML ¶
SanitizePageHTML sanitizes rich-text page HTML with a conservative UGC policy.
Types ¶
This section is empty.