HtmlCleaner CleanerProperties parameter configuration (reposted from macken's blog, link: http://macken.iteye.com/blog/1579809)

Parameter | Default | Explanation

advancedXmlEscape | true | If this parameter is set to true, an ampersand sign (&) that begins a valid XML character sequence (&XXX;) will not be escaped to &amp;XXX;.

transResCharsToNCR | false | If this parameter is set to true, reserved XML sequences (&amp;, &quot;, &apos;, &lt;, &gt;) are serialized to their Numeric Character Representations (&#38;, &#34;, &#39;, &#60;, &#62;). This parameter has effect only if advancedXmlEscape is set to true.

translateSpecialEntities | true | If true, special HTML entities are replaced with the Unicode characters they represent. This doesn't include &amp;, &lt;, &gt;, &quot;, &apos;.

transSpecialEntitiesToNCR | false | If this parameter is set to true, special HTML entities (e.g. &Alpha;) are serialized to their Numeric Character Representations (&#913;). This parameter has effect only if translateSpecialEntities is set to true.

recognizeUnicodeChars | true | If true, HTML characters represented by their codes in the form &#XXXX; are replaced with real Unicode characters.

useCdata | true | If true, HtmlCleaner will treat SCRIPT and STYLE tag contents as CDATA sections; otherwise they will be regarded as ordinary text (special characters will be escaped).

omitUnknownTags | false | Tells whether to skip (ignore) unknown tags during cleanup.

treatUnknTagsAsContent | false | Tells whether to treat unknown tags as ordinary content, i.e. <something...> will be transformed to &lt;something...&gt;. This attribute is applicable only if omitUnknownTags is set to false.

omitDeprTags | false | Tells whether to skip (ignore) deprecated HTML tags during cleanup.

treatDeprTagsAsContent | false | Tells whether to treat deprecated tags as ordinary content, i.e. <font...> will be transformed to &lt;font...&gt;. This attribute is applicable only if omitDeprecatedTags is set to false.

omitComments | false | Tells whether to skip HTML comments.

omitXmlDeclaration | false | Tells whether or not to put the XML declaration line at the beginning of the resulting XML.

omitDoctypeDeclaration | true | Tells whether to skip the HTML declaration found in the source document. If the HTML document being cleaned doesn't contain one, it wouldn't be placed in the result anyway.

omitXmlnsAttributes | false | This flag is deprecated since version 1.3 and namespacesAware should be used instead.
omitEnvelope | false | Tells whether to remove the open and close tags of the node being serialized. This parameter is introduced in HtmlCleaner 2.2 to replace omitHtmlEnvelope. If set to true, serialization skips the open and close tags of the node and outputs only the node's children.

useEmptyElementTags | true | Specifies how to serialize tags with an empty body: if true, compact notation is used (<xxx/>), otherwise <xxx></xxx>.

allowMultiWordAttributes | true | Tells the parser whether to allow attribute values consisting of multiple words or not. If true, attribute att="a b c" will stay like it is, and if false the parser will split this into att="a" b="b" c="c" (this is default browsers' behaviour).

allowHtmlInsideAttributes | false | Tells the parser whether to allow HTML tags inside attribute values. For example, when this flag is set, att="here is <a href='xxxx'>link</a>" will stay like it is, and if not, the parser will end the attribute value after "here is ". This flag makes sense only if allowMultiWordAttributes is set as well.

ignoreQuestAndExclam | true | Tells the parser whether to completely ignore tags that have the form <?TAGNAME....> or <!TAGNAME....>. This way some HTML/XML processing instructions may be omitted from the resulting XML.

namespacesAware | true | If true, namespace prefixes found during parsing will be preserved and all necessary xmlns namespace declarations will be added in the root element. If false, all namespace prefixes and all xmlns namespace declarations will be stripped.

hyphenReplacement | = | XML doesn't allow the double hyphen sequence (--) inside comments. This parameter tells which replacement to use for it when a double hyphen is encountered during parsing.

pruneTags | empty string | Comma-separated list of tags that will be completely removed (with all nested elements) from the XML tree after parsing. For example, if pruneTags is "script,style", the resulting XML will not contain scripts and styles.

booleanAtts | self | Tells the cleaner what value to give to boolean attributes, like checked, selected and similar. Allowed values are self (the value of the attribute is the same as the attribute name: checked="checked"), empty (the attribute value is an empty string: checked="") and true (the value of the attribute is "true": checked="true").

nodeByXpath |  | XPath expression used to select the first node that is going to be serialized instead of the whole HTML document. For example, if this parameter is set to //table[1], only the first table in the document will be serialized.
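
To make the advancedXmlEscape rule from the table more concrete, here is a minimal Java sketch of the behaviour it describes. This is not HtmlCleaner's actual implementation: the class name AmpersandEscapeSketch, the method escapeAmpersands, and the regular expression deciding what counts as a "valid XML character sequence" are all assumptions made for illustration, and only the standard library is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AmpersandEscapeSketch {
    // Assumption: a "valid XML character sequence" is a named entity (&name;)
    // or a decimal/hexadecimal character reference (&#38; or &#x26;).
    private static final Pattern ENTITY =
            Pattern.compile("&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);");

    // Hypothetical helper: escapes ampersands the way the table describes.
    static String escapeAmpersands(String text, boolean advancedXmlEscape) {
        if (!advancedXmlEscape) {
            // Plain mode: every ampersand is escaped, even inside entities.
            return text.replace("&", "&amp;");
        }
        StringBuilder out = new StringBuilder();
        Matcher m = ENTITY.matcher(text);
        int pos = 0;
        while (m.find()) {
            // Escape stray ampersands before the entity, keep the entity itself.
            out.append(text.substring(pos, m.start()).replace("&", "&amp;"));
            out.append(m.group());
            pos = m.end();
        }
        out.append(text.substring(pos).replace("&", "&amp;"));
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "Tom & Jerry &lt;3 &#169; 2012";
        System.out.println(escapeAmpersands(input, true));
        System.out.println(escapeAmpersands(input, false));
    }
}
```

With the flag on, only the stray ampersand in "Tom & Jerry" is escaped; with it off, the existing &lt; and &#169; sequences are escaped a second time, which is usually what the default of true is meant to avoid.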

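
The booleanAtts and hyphenReplacement rows can be illustrated the same way. The sketch below is a standalone approximation, assuming nothing beyond what the table states; the class SerializationOptionsSketch and both of its methods are hypothetical names and are not part of the HtmlCleaner API.

```java
class SerializationOptionsSketch {
    // Renders a boolean attribute for each of the three allowed booleanAtts values.
    static String booleanAttribute(String name, String mode) {
        switch (mode) {
            case "self":
                return name + "=\"" + name + "\"";   // checked="checked"
            case "empty":
                return name + "=\"\"";               // checked=""
            case "true":
                return name + "=\"true\"";           // checked="true"
            default:
                throw new IllegalArgumentException("Unknown booleanAtts value: " + mode);
        }
    }

    // Replaces each double hyphen in comment text, since XML forbids "--" there.
    static String sanitizeComment(String commentText, String hyphenReplacement) {
        return commentText.replace("--", hyphenReplacement);
    }

    public static void main(String[] args) {
        System.out.println(booleanAttribute("checked", "self"));
        System.out.println(booleanAttribute("selected", "empty"));
        // "=" is the default hyphenReplacement listed in the table.
        System.out.println("<!--" + sanitizeComment(" old -- new ", "=") + "-->");
    }
}
```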

Reposted from: http://macken.iteye.com/blog/1579809

Original post: http://www.cnblogs.com/yigui/p/7274728.html

