java.lang.Object
  Segment
    Element
public final class Element
Represents an element in a specific source document, which encompasses a start tag, an optional end tag and all content in between.
Take the following HTML segment as an example:
<p>This is a sample paragraph.</p>
The whole segment is represented by an Element object. It comprises the StartTag "<p>", the EndTag "</p>", and the text in between.
An element may also contain other elements between its start and end tags.
The term normal element refers to an element having a start tag with a type of StartTagType.NORMAL.
This comprises all HTML elements and non-HTML elements.
Element instances are obtained using one of the following methods:
StartTag.getElement()
EndTag.getElement()
Segment.findAllElements()
Segment.findAllElements(String name)
Segment.findAllElements(StartTagType)
See also the HTMLElements class, and the XML 1.0 specification for elements.
The three possible structures of an element are listed below:
<img src="mypicture.jpg">
The element consists only of a single start tag and has no element content (although the start tag itself may have tag content).
getEndTag()==null
isEmpty()==true
getEnd()==getStartTag().getEnd()
This occurs in the following situations:
<p>This is a sample paragraph.</p>
The element consists of a start tag, content, and an end tag.
getEndTag()!=null
isEmpty()==false (provided the end tag doesn't immediately follow the start tag)
getEnd()==getEndTag().getEnd()
This occurs in the following situations, assuming the start tag's matching end tag is present in the source document:
<p>This text is included in the paragraph element even though no end tag is present.
<p>This is the next paragraph.
The element consists of a start tag and content, but no end tag.
getEndTag()==null
isEmpty()==false
getEnd()!=getStartTag().getEnd()
This only occurs in an HTML element for which the end tag is optional.
The element ends at the start of a tag which implies the termination of the element, called the implicitly terminating tag. If the implicitly terminating tag is situated immediately after the element's start tag, the element is classed as a single tag element.
See the element parsing rules for HTML elements with optional end tags for details on which tags can implicitly terminate a given element.
See also the documentation of the HTMLElements.getEndTagOptionalElementNames() method.
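The three structures above can be told apart purely from offsets. The following is a minimal sketch of that idea, not the library's implementation: `ElementShapeSketch` and all of its members are hypothetical stand-ins that only mirror the `getEndTag()`, `isEmpty()` and `getEnd()` relationships listed above, and the offsets are derived from simple string searches rather than from a real parse.

```java
// Hypothetical stand-in for illustration only; this is NOT the library's Element class.
final class ElementShapeSketch {
    private final int startTagEnd;  // offset just past the start tag
    private final int endTagBegin;  // offset where the end tag begins, or -1 if there is no end tag
    private final int end;          // offset just past the whole element

    ElementShapeSketch(int startTagEnd, int endTagBegin, int end) {
        this.startTagEnd = startTagEnd;
        this.endTagBegin = endTagBegin;
        this.end = end;
    }

    // Mirrors getEndTag()!=null.
    boolean hasEndTag() {
        return endTagBegin >= 0;
    }

    // Content runs from the end of the start tag to the start of the end tag,
    // or to the end of the element when there is no end tag.
    int contentLength() {
        int contentEnd = hasEndTag() ? endTagBegin : end;
        return contentEnd - startTagEnd;
    }

    // Mirrors isEmpty(): zero-length content.
    boolean isEmpty() {
        return contentLength() == 0;
    }

    String describe() {
        if (hasEndTag()) return "start tag, content, end tag";
        if (end == startTagEnd) return "single start tag only";
        return "start tag and content, no end tag";
    }

    // Structure 1: the element is just its start tag.
    static ElementShapeSketch demoSingleTag() {
        String src = "<img src=\"mypicture.jpg\">";
        return new ElementShapeSketch(src.length(), -1, src.length());
    }

    // Structure 2: start tag, content and end tag are all present.
    static ElementShapeSketch demoWithEndTag() {
        String src = "<p>This is a sample paragraph.</p>";
        int startTagEnd = src.indexOf('>') + 1;
        int endTagBegin = src.lastIndexOf("</p>");
        return new ElementShapeSketch(startTagEnd, endTagBegin, src.length());
    }

    // Structure 3: no end tag; the element ends where the next <p> start tag begins.
    static ElementShapeSketch demoImplicitEnd() {
        String src = "<p>First paragraph.\n<p>Second paragraph.";
        int startTagEnd = src.indexOf('>') + 1;
        int end = src.indexOf("<p>", startTagEnd);
        return new ElementShapeSketch(startTagEnd, -1, end);
    }

    public static void main(String[] args) {
        System.out.println(demoSingleTag().describe());
        System.out.println(demoWithEndTag().describe());
        System.out.println(demoImplicitEnd().describe());
    }
}
```

In real code the same distinctions would be read from the `Element` itself using `getEndTag()`, `isEmpty()` and `getEnd()`, as listed for each structure.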
StartTag.getElement() method to construct an element.
The detection of the start tag's matching end tag or other terminating tags always takes into account the possible nesting of elements.
StartTagType.NORMAL:
isEmptyElementTag() method for more information.
StartTagType.NORMAL:
HTMLElements
Field Summary |
---|
Fields inherited from interface HTMLElementName |
---|
A, ABBR, ACRONYM, ADDRESS, APPLET, AREA, B, BASE, BASEFONT, BDO, BIG, BLOCKQUOTE, BODY, BR, BUTTON, CAPTION, CENTER, CITE, CODE, COL, COLGROUP, DD, DEL, DFN, DIR, DIV, DL, DT, EM, FIELDSET, FONT, FORM, FRAME, FRAMESET, H1, H2, H3, H4, H5, H6, HEAD, HR, HTML, I, IFRAME, IMG, INPUT, INS, ISINDEX, KBD, LABEL, LEGEND, LI, LINK, MAP, MENU, META, NOFRAMES, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, P, PARAM, PRE, Q, S, SAMP, SCRIPT, SELECT, SMALL, SPAN, STRIKE, STRONG, STYLE, SUB, SUP, TABLE, TBODY, TD, TEXTAREA, TFOOT, TH, THEAD, TITLE, TR, TT, U, UL, VAR |
Method Summary | |
---|---|
Attributes | getAttributes(): Returns the attributes specified in this element's start tag. |
java.lang.String | getAttributeValue(java.lang.String attributeName): Returns the decoded value of the attribute with the specified name (case insensitive). |
java.util.List | getChildElements(): Returns a list of the immediate children of this element in the document element hierarchy. |
Segment | getContent(): Returns the segment representing the content of the element. |
java.lang.String | getDebugInfo(): Returns a string representation of this object useful for debugging purposes. |
int | getDepth(): Returns the nesting depth of this element in the document element hierarchy. |
EndTag | getEndTag(): Returns the end tag of the element. |
FormControl | getFormControl(): Returns the FormControl defined by this element. |
java.lang.String | getName(): Returns the name of the start tag of this element, always in lower case. |
Element | getParentElement(): Returns the parent of this element in the document element hierarchy. |
StartTag | getStartTag(): Returns the start tag of the element. |
boolean | isEmpty(): Indicates whether this element has zero-length content. |
boolean | isEmptyElementTag(): Indicates whether this element is an empty-element tag. |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Method Detail |
---|
public Element getParentElement()
The Source.fullSequentialParse() method should be called after construction of the Source object if this method is to be used.
This method returns null for a top-level element, as well as any element formed from a server tag, regardless of whether it is nested inside a normal element.
See the Source.getChildElements() method for more details.
Returns: the parent of this element in the document element hierarchy, or null if this element is a top-level element.
See also: getChildElements()
public final java.util.List getChildElements()
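The objects in the list are all of type Element.
See the Source.getChildElements() method for more details.
Overrides: getChildElements in class Segment
Returns: a list of the immediate children of this element in the document element hierarchy, never null.
See also: getParentElement()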
public int getDepth()
The Source.fullSequentialParse() method should be called after construction of the Source object if this method is to be used.
A top-level element has a nesting depth of 0.
An element formed from a server tag always has a nesting depth of 0, regardless of whether it is nested inside a normal element.
See the Source.getChildElements() method for more details.
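The relationship between the parent reference and the nesting depth can be sketched as follows. This is a minimal illustration under the assumption that depth equals the number of ancestors. `DepthSketch` is a hypothetical stand-in, and the library may well compute or cache the depth differently during the full sequential parse.

```java
// Hypothetical stand-in for illustration only; this is NOT the library's Element class.
final class DepthSketch {
    private final String name;
    private final DepthSketch parent; // null for a top-level element

    DepthSketch(String name, DepthSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    String getName() {
        return name;
    }

    // Mirrors getParentElement(): null for a top-level element.
    DepthSketch getParent() {
        return parent;
    }

    // Assumed model: count the ancestors, so a top-level element has depth 0.
    int getDepth() {
        int depth = 0;
        for (DepthSketch p = parent; p != null; p = p.parent) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        DepthSketch html = new DepthSketch("html", null);
        DepthSketch body = new DepthSketch("body", html);
        DepthSketch p = new DepthSketch("p", body);
        System.out.println(html.getName() + " depth=" + html.getDepth());
        System.out.println(body.getName() + " depth=" + body.getDepth());
        System.out.println(p.getName() + " depth=" + p.getDepth());
    }
}
```

Under this model, an element formed from a server tag would be created with a null parent, which matches the documented depth of 0 for such elements.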
See also: getParentElement()
public Segment getContent()
This segment spans from the end of the start tag to the start of the end tag. If the end tag is not present, the content extends to the end of the element.
Note that before version 2.0 this method returned null if the element was empty, whereas now a zero-length segment is returned.
Returns: the segment representing the content of the element, never null.
public StartTag getStartTag()
public EndTag getEndTag()
If the element has no end tag this method returns null.
Returns: the end tag of the element, or null if the element has no end tag.
public java.lang.String getName()
This is equivalent to getStartTag().getName().
See the Tag.getName() method for more information.
public boolean isEmpty()
This is equivalent to getContent().length()==0.
Note that this definition is broader than both the HTML definition of an empty element, which covers only those elements whose end tag is forbidden, and the XML definition of an empty element, which is "either a start-tag immediately followed by an end-tag, or an empty-element tag". The other possibility covered by this property is the case of an HTML element with an optional end tag that is immediately followed by another tag that implicitly terminates the element.
Returns: true if this element has zero-length content, otherwise false.
See also: isEmptyElementTag()
public boolean isEmptyElementTag()
An empty-element tag is signified by an empty element with the characters "/>" at the end of the start tag.
This is equivalent to isEmpty() && getStartTag().isEmptyElementTag().
The StartTag.isEmptyElementTag() property only checks whether the start tag is syntactically an empty-element tag, whereas this property also makes sure the element is in fact empty.
A syntactical empty-element tag that is not actually empty can occur if the end tag of an HTML element is either required or optional, but the start tag is erroneously terminated with the characters "/>" in the source document.
All major browsers ignore the syntactical hint of an empty element in this case, even in an XHTML document, so this parser does the same.
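The difference between the syntactic check on the start tag and the element-level check can be sketched as below. This is a simplified illustration rather than the library's code. `EmptyElementTagSketch` and its methods are hypothetical, and the start tag and content are passed in as plain strings instead of being parsed from a source document.

```java
// Hypothetical stand-in for illustration only; this is NOT the library's implementation.
final class EmptyElementTagSketch {

    // Syntactic check only: does the start tag text end with "/>"?
    static boolean startTagIsEmptyElementTag(String startTagText) {
        return startTagText.endsWith("/>");
    }

    // Element-level check: the content must be zero-length AND the start tag
    // must be syntactically an empty-element tag.
    static boolean elementIsEmptyElementTag(String startTagText, String content) {
        return content.isEmpty() && startTagIsEmptyElementTag(startTagText);
    }

    public static void main(String[] args) {
        // A genuine empty-element tag.
        System.out.println(elementIsEmptyElementTag("<br />", ""));
        // Start tag erroneously written with "/>" on an element that has content.
        System.out.println(startTagIsEmptyElementTag("<p />"));
        System.out.println(elementIsEmptyElementTag("<p />", "Some text"));
    }
}
```

The second case is the one described above: the start tag looks like an empty-element tag, but because the element still has content the element-level check reports false.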
Returns: true if this element is an empty-element tag, otherwise false.
public Attributes getAttributes()
This is equivalent to getStartTag().getAttributes().
See also: StartTag.getAttributes()
public java.lang.String getAttributeValue(java.lang.String attributeName)
Returns null if the start tag of this element does not have attributes, no attribute with the specified name exists or the attribute has no value.
This is equivalent to getStartTag().getAttributeValue(attributeName).
Parameters: attributeName - the name of the attribute to get.
Returns: the decoded value of the attribute with the specified name, or null if the attribute does not exist or has no value.
public FormControl getFormControl()
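Returns the FormControl defined by this element.
Returns: the FormControl defined by this element, or null if it is not a control.
public java.lang.String getDebugInfo()
Returns a string representation of this object useful for debugging purposes.
Overrides: getDebugInfo in class Segment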