Class TagInfo

java.lang.Object
org.htmlcleaner.TagInfo

public class TagInfo extends Object

This class holds information about a single HTML tag.
It also holds the rules used for tag balancing. For each tag, a list of dependent tags may be defined. Several kinds of dependencies are used to reorder tags:

  • fatal tags - a required outer tag; if the fatal tag is missing, this tag is ignored (skipped) during parsing. For example, most web browsers ignore TD, TR and TBODY elements that are not in the context of a TABLE tag.
  • required enclosing tags - if no such tag encloses this one, it is created implicitly. For example, if TD occurs outside TR, an opening TR is created before it.
  • forbidden tags - tags inside which this tag is not allowed to occur. For example, FORM cannot be inside another FORM; the nested one is ignored during cleanup.
  • allowed children tags - for example, TR allows TD and TH. If any allowed tags are defined, the cleaner treats all other tags as not allowed, unless they are in some other relationship with this tag.
  • preferred child tag - used when a child tag does not match but, by default, an intervening tag should be inserted rather than moving the child outside. For example, LI in UL, TD in TR.
  • higher level tags - for example for TR higher tags are THEAD, TBODY, TFOOT.
  • tags that must be closed and copied - for example, in <a href="#"><div>..., tag A must be closed before DIV but copied again inside DIV.
  • tags that must be closed before closing this tag and copied again after - for example, in <i><b>at</i> first</b> text, tag B must be closed before closing I, but it must be copied again after, resulting in the sequence <i><b>at</b></i><b> first</b> text.
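
The last rule above can be illustrated with a small standalone sketch. It is not HtmlCleaner's balancing code: the class CopyAfterSketch and its close method are hypothetical names introduced here, and the sketch only assumes that the open tags are tracked as a list, outermost first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of org.htmlcleaner.
class CopyAfterSketch {

    /**
     * Builds the markup emitted when the tag named closing is closed.
     * openTags lists the currently open tags, outermost first.
     * Tags nested inside closing are closed first (innermost first);
     * those contained in copyAfter are reopened afterwards.
     */
    static String close(List<String> openTags, String closing, Set<String> copyAfter) {
        int index = openTags.lastIndexOf(closing);
        if (index < 0) {
            return ""; // the tag is not open, so there is nothing to close
        }
        StringBuilder out = new StringBuilder();
        List<String> reopen = new ArrayList<>();
        for (int i = openTags.size() - 1; i > index; i--) {
            String inner = openTags.get(i);
            out.append("</").append(inner).append('>');
            if (copyAfter.contains(inner)) {
                reopen.add(0, inner); // keep the original nesting order
            }
        }
        out.append("</").append(closing).append('>');
        for (String tag : reopen) {
            out.append('<').append(tag).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // <i><b>at</i> ... : I is closed while B is still open inside it.
        String emitted = close(Arrays.asList("i", "b"), "i", Collections.singleton("b"));
        System.out.println(emitted); // prints </b></i><b>
    }
}
```

With I and B open and B registered as a copy-after tag of I, the closing </i> expands to </b></i><b>, which matches the sequence given in the example above.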

Tag TR for instance (table row) may define the following dependencies:

  • fatal tag is table
  • required enclosing tag is tbody
  • allowed children tags are td,th
  • higher level tags are thead,tfoot
  • tags that must be closed before are tr,td,th,caption,colgroup
meaning the following:
  • tr must be in the context of table, otherwise it will be ignored.
  • tr may be directly inside tbody, tfoot and thead, otherwise tbody will be implicitly created in front of it.
  • tr can contain td and th; all other tags and content will be pushed out of the current limiting context, which for HTML tables means in front of the enclosing table tag.
  • if the previous open tag is one of tr, caption or colgroup, it will be implicitly closed.
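
As a rough illustration of how the TR rules above combine, the following standalone sketch models them with plain sets. It is an assumption-laden stand-in, not org.htmlcleaner.TagInfo: the class TrRulesSketch, the parse helper and the placeTr method are hypothetical names, and the comma-separated lists merely mirror the style of the define* methods listed under Method Details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration; this is not org.htmlcleaner.TagInfo.
class TrRulesSketch {

    // Splits a comma-separated tag list into a set of lower-case tag names.
    static Set<String> parse(String commaSeparatedListOfTags) {
        Set<String> tags = new LinkedHashSet<>();
        for (String part : commaSeparatedListOfTags.split(",")) {
            String name = part.trim().toLowerCase(Locale.ROOT);
            if (!name.isEmpty()) {
                tags.add(name);
            }
        }
        return tags;
    }

    // The TR dependencies exactly as listed in the text above.
    static final Set<String> FATAL = parse("table");
    static final Set<String> REQUIRED_ENCLOSING = parse("tbody");
    static final Set<String> HIGHER = parse("thead,tfoot");
    static final Set<String> CLOSE_BEFORE = parse("tr,td,th,caption,colgroup");

    /**
     * Decides what happens to an incoming tr, given the currently open
     * tags (outermost first). Returns "ignore", "keep" or "wrap in tbody".
     */
    static String placeTr(List<String> openTags) {
        boolean inTable = false;
        for (String tag : openTags) {
            if (FATAL.contains(tag)) {
                inTable = true;
            }
        }
        if (!inTable) {
            return "ignore"; // fatal tag missing
        }
        // Implicitly close trailing open tags that must be closed before tr.
        List<String> remaining = new ArrayList<>(openTags);
        while (CLOSE_BEFORE.contains(remaining.get(remaining.size() - 1))) {
            remaining.remove(remaining.size() - 1);
        }
        String parent = remaining.get(remaining.size() - 1);
        if (REQUIRED_ENCLOSING.contains(parent) || HIGHER.contains(parent)) {
            return "keep";
        }
        return "wrap in tbody"; // required enclosing tag is created implicitly
    }

    public static void main(String[] args) {
        System.out.println(placeTr(Arrays.asList("div")));
        System.out.println(placeTr(Arrays.asList("table")));
        System.out.println(placeTr(Arrays.asList("table", "tbody", "tr", "td")));
    }
}
```

The sketch closes every trailing tag in the close-before list, which follows the dependency list rather than the shorter enumeration in the last bullet. The real cleaner may order or refine these checks differently.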

  • Constructor Details

  • Method Details

    • getAssumedNamespace

      public String getAssumedNamespace()
    • setAssumedNamespace

      public void setAssumedNamespace(String assumedNamespace)
    • getAssumedNamespacePrefix

      public String getAssumedNamespacePrefix()
    • setAssumedNamespacePrefix

      public void setAssumedNamespacePrefix(String assumedNamespacePrefix)
    • defineFatalTags

      public void defineFatalTags(String commaSeparatedListOfTags)
    • defineRequiredEnclosingTags

      public void defineRequiredEnclosingTags(String commaSeparatedListOfTags)
    • defineForbiddenTags

      public void defineForbiddenTags(String commaSeparatedListOfTags)
    • defineAllowedChildrenTags

      public void defineAllowedChildrenTags(String commaSeparatedListOfTags)
    • defineHigherLevelTags

      public void defineHigherLevelTags(String commaSeparatedListOfTags)
    • defineCloseBeforeCopyInsideTags

      public void defineCloseBeforeCopyInsideTags(String commaSeparatedListOfTags)
    • defineCloseInsideCopyAfterTags

      public void defineCloseInsideCopyAfterTags(String commaSeparatedListOfTags)
    • defineCloseBeforeTags

      public void defineCloseBeforeTags(String commaSeparatedListOfTags)
    • getDisplay

      public Display getDisplay()
    • setDisplay

      public void setDisplay(Display display)
    • getName

      public String getName()
    • setName

      public void setName(String name)
    • getContentType

      public ContentType getContentType()
    • getMustCloseTags

      public Set<String> getMustCloseTags()
    • setMustCloseTags

      public void setMustCloseTags(Set<String> mustCloseTags)
    • getHigherTags

      public Set<String> getHigherTags()
    • setHigherTags

      public void setHigherTags(Set<String> higherTags)
    • getChildTags

      public Set<String> getChildTags()
    • setChildTags

      public void setChildTags(Set<String> childTags)
    • getPermittedTags

      public Set<String> getPermittedTags()
    • setPermittedTags

      public void setPermittedTags(Set<String> permittedTags)
    • getCopyTags

      public Set<String> getCopyTags()
    • setCopyTags

      public void setCopyTags(Set<String> copyTags)
    • getContinueAfterTags

      public Set<String> getContinueAfterTags()
    • setContinueAfterTags

      public void setContinueAfterTags(Set<String> continueAfterTags)
    • getRequiredParentTags

      public Set<String> getRequiredParentTags()
    • setRequiredParent

      public void setRequiredParent(String requiredParent)
    • getBelongsTo

      public BelongsTo getBelongsTo()
    • setBelongsTo

      public void setBelongsTo(BelongsTo belongsTo)
    • getFatalTags

      public Set<String> getFatalTags()
    • isFatalTag

      public boolean isFatalTag(String tag)
    • setFatalTag

      public void setFatalTag(String fatalTag)
    • isDeprecated

      public boolean isDeprecated()
    • setDeprecated

      public void setDeprecated(boolean deprecated)
    • isUnique

      public boolean isUnique()
    • setUnique

      public void setUnique(boolean unique)
    • isEmptyTag

      public boolean isEmptyTag()
    • isMinimizedTagPermitted

      public boolean isMinimizedTagPermitted()
    • getPreferredChildTag

      public String getPreferredChildTag()
    • setPreferredChildTag

      public void setPreferredChildTag(String preferredChildTag)