XPath Builder – a small snippet

As far as I can tell, there is no built-in function in JavaScript to get the XPath of an element easily. There are plenty of code snippets out there, but this one from Stack Overflow totally caught my eye!

[Image: xpath_builder]

This small piece of code is like a feast for people new to JavaScript like me. Not only because of the functional programming style the author mentions, but also because of the concise and compact writing style: it builds anonymous functions, uses some of the newer language operators, and uses recursive functions almost as one-liners!

In this blog post, we will patiently walk through it line by line, character by character and appreciate the beauty together. So bring some popcorn and let’s get started.

 

const func = () => {}

Functions in JavaScript can be defined in many ways. For example, getXPathForElement is created with a function declaration in the traditional way, as below:

function func() {}

However, using the arrow syntax (=>), you can create an anonymous function and assign it to func via an expression. Here is a great Stack Overflow question that highlights the difference between the two approaches. In my personal view, inside a function it is probably better to use const with an arrow function, but if you are writing a function that will be called from somewhere else, it is best to use a function declaration. Declarations are hoisted, so the function can be called anywhere in its enclosing scope, even before the line that defines it.
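
To make the difference concrete, here is a small sketch of my own (the names declared and arrow are made up for illustration, they are not from the snippet). A function declaration can be called before the line that defines it, while a const holding an arrow function cannot:

```javascript
// A function declaration is hoisted, so this call works even though
// the definition appears further down.
const early = declared();

function declared() {
  return "declared";
}

// A const binding exists only after its definition has run.
// Calling it earlier throws a ReferenceError.
let tooEarly;
try {
  arrow();
} catch (err) {
  tooEarly = err.name;
}

const arrow = () => "arrow";
const late = arrow(); // fine: the definition has already run
```

This is why I would keep arrow functions for small local helpers, where the order of definition is obvious.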

= condition ? then : else

This is the conditional (ternary) operator, a concise way to write an if statement. It can look odd at first, but once you wrap your head around the syntax and become comfortable with what it means, it is just like any other language feature, such as Python's list comprehensions: it can make your code more concise. Instead of 10 lines, it is now a one-liner that is not necessarily more difficult to read.
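
As a quick comparison, here is the same logic written both ways. The function names are hypothetical and only serve the illustration:

```javascript
// The long form, with an explicit if/else.
function labelLong(count) {
  if (count === 1) {
    return "item";
  } else {
    return "items";
  }
}

// The same logic as a one-liner with the conditional operator.
const labelShort = count => count === 1 ? "item" : "items";

// Conditionals can also be chained, reading top to bottom like else-if.
const describeSign = n =>
  n < 0 ? "negative"
  : n === 0 ? "zero"
  : "positive";
```

The chained form at the end is worth getting used to, because the snippet stacks two conditions in the same way.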

idx logic

idx is a recursive function that constructs the index for a given element. If there is no previous sibling, the element is the first (or only) child and idx returns 1. If there are previous siblings, it computes the index for the previous sibling and adds the value of sib.localName == name on top of it.

name||sib.localName evaluates to name if it has been provided, and otherwise falls back to sib.localName. In practice name is only missing on the very first call, so this is how the tag name of the starting element gets passed down the recursion.

sib.localName == name checks whether the sibling's localName matches name. For a regular DOM element, localName is simply the tag name.

Let’s jump ahead a bit and look at how idx gets called in the end: idx(elm). It is only passed one argument, which leaves name undefined.

[Image: function_undefinedundefined_or]

By testing, we can find that undefined || x evaluates to x for any truthy x. The equality check cannot be true on this first call, since a tag name never equals undefined, so the code will likely be interpreted as

idx = sib ? idx(sib.previousElementSibling, sib.localName) : 1
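
The two facts used in this step are easy to check in a console. The variable names below are mine, chosen only for this check:

```javascript
// With || the left side wins when it is truthy, otherwise the right side.
const missingName = undefined;
const fallback = missingName || "div"; // "div"
const kept = "span" || "div";          // "span"

// A comparison yields a boolean, and adding a boolean to a number
// treats false as 0 and true as 1.
const firstCall = 1 + ("div" == undefined); // 1
const sameTag = 1 + ("div" == "div");       // 2
```

That second behaviour is what lets the snippet write + (sib.localName == name) instead of a separate if statement.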

And the next round, writing prev for elm.previousElementSibling, will be

idx = prev ? idx(prev.previousElementSibling, elm.localName) + (prev.localName == elm.localName) : 1

Whenever the previous sibling shares the same tag, the comparison evaluates to true, which counts as 1 when added, so this becomes

idx = … ? idx () + 1 : 1

This is a very clever way of constructing the index, given how XPath defines it: a position such as div[2] counts only the siblings with the same tag name, starting from 1.

[Image: xpath_tag_index]

I know the logic might sound a bit complex, but in short, it counts how many preceding siblings share the same tag name and adds 1. Much easier to understand, right?
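
To try this without a browser, here is a sketch that rebuilds idx from the pieces quoted in this post (so it may differ in detail from the original answer) and runs it on plain objects. The makeSiblings helper is hypothetical: it only fakes the two properties idx reads, localName and previousElementSibling.

```javascript
// idx as pieced together from the fragments discussed above.
const idx = (sib, name) => sib
  ? idx(sib.previousElementSibling, name || sib.localName) + (sib.localName == name)
  : 1;

// Hypothetical helper: turn a list of tag names into a chain of
// sibling-like objects, each pointing back at the one before it.
const makeSiblings = tags => {
  let prev = null;
  return tags.map(tag => {
    const node = { localName: tag, previousElementSibling: prev };
    prev = node;
    return node;
  });
};

// Children of some parent, in document order: <div>, <p>, <div>, <div>
const [div1, para, div2, div3] = makeSiblings(["div", "p", "div", "div"]);

idx(div1); // 1, no earlier <div>
idx(para); // 1, the <div> before it does not count
idx(div2); // 2, one earlier <div>
idx(div3); // 3, two earlier <div>s
```

Notice that the paragraph in between is skipped when counting the divs, which is exactly the XPath behaviour.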

!elm || elm.nodeType !== 1

This is the stop condition for the segs function. The recursion stops only when elm doesn’t exist or when elm is no longer an element node (nodeType 1), for example a text node or the document node. When this happens, it returns an empty string.
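
The condition on its own can be tried with plain objects standing in for DOM nodes. The name shouldStop is made up for this sketch; the nodeType numbers are the standard DOM ones (1 for an element, 3 for a text node, 9 for the document).

```javascript
// The stop condition, pulled out into its own hypothetical function.
const shouldStop = elm => !elm || elm.nodeType !== 1;

shouldStop(null);            // true: nothing left to walk
shouldStop({ nodeType: 3 }); // true: a text node
shouldStop({ nodeType: 9 }); // true: the document node above <html>
shouldStop({ nodeType: 1 }); // false: a regular element, keep going
```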

id(${elm.id})

The next check is whether the element has an id. In XPath, an id is meant to locate an element uniquely. If the id exists, and the element found by looking up that id is the same element (I assume this guards against duplicated ids, which are invalid), then it returns id(${elm.id}). This text is enclosed in backticks, which makes it a template literal: ${} is evaluated and replaced with the value of the expression inside. For example, if the element has an id whose value is abc, the returned string will be id(abc).
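
Here is the template literal next to the old-fashioned string concatenation. The elm object is just a stand-in with an id property. The last line is my own assumption rather than something shown above: XPath's id() function is normally given a quoted string, so a quoted variant may be what you want in practice.

```javascript
// A stand-in for a DOM element that has an id.
const elm = { id: "abc" };

// Template literal: backticks, with ${...} replaced by the value.
const viaTemplate = `id(${elm.id})`;        // "id(abc)"

// The same string built by concatenation.
const viaConcat = "id(" + elm.id + ")";     // "id(abc)"

// Assumption: a variant that wraps the id in double quotes.
const quoted = `id("${elm.id}")`;           // 'id("abc")'
```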

… rest, spread

The triple dot (...) is special syntax in JavaScript. In a function's parameter list it captures the rest of the arguments (rest syntax), and in an array literal or a function call it spreads out the items of a list-like variable (spread syntax).
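
Both uses in a few lines. All names here are hypothetical, and the path segments are made-up sample values:

```javascript
// Rest: everything after the first argument is collected into an array.
const countRest = (first, ...rest) => rest.length;
countRest("a", "b", "c"); // 2

// Spread: the items of one array are laid out inside a new array.
const parentSegs = ["", "html", "body"];
const allSegs = [...parentSegs, "div[2]"]; // ["", "html", "body", "div[2]"]

// Joining with "/" turns the leading empty string into the leading slash.
const xpath = allSegs.join("/"); // "/html/body/div[2]"
```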

[Image: spread_operator]

Now we know that […first_part, second_part] does the job: first_part computes the XPath segments for the parent, and second_part is the tag of the element together with its index among its siblings.
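
Putting the pieces together, here is a sketch of how the whole thing might look, assembled from the parts discussed in this post rather than copied from the original answer. Several things are my assumptions: the stop condition returns an array holding one empty string so that it can be spread and joined, the id is wrapped in double quotes, and the id lookup is passed in as a function (byId) so the sketch runs without a browser. In a page you would pass id => document.getElementById(id). The mock tree is built from plain objects through a hypothetical mockElement helper.

```javascript
// byId is a stand-in for document.getElementById.
const makeXPath = byId => {
  const idx = (sib, name) => sib
    ? idx(sib.previousElementSibling, name || sib.localName) + (sib.localName == name)
    : 1;
  const segs = elm => !elm || elm.nodeType !== 1
    ? [""]                                   // assumption: array, not a bare string
    : elm.id && byId(elm.id) === elm
      ? [`id("${elm.id}")`]                  // assumption: quoted id
      : [...segs(elm.parentNode), `${elm.localName}[${idx(elm)}]`];
  return elm => segs(elm).join("/");
};

// Hypothetical helper that fakes only the properties the code reads.
const mockElement = (localName, parentNode, previousElementSibling = null, id = "") =>
  ({ nodeType: 1, localName, parentNode, previousElementSibling, id });

// document > html > body > div#main > p, p
const docNode = { nodeType: 9 };
const html = mockElement("html", docNode);
const body = mockElement("body", html);
const main = mockElement("div", body, null, "main");
const firstP = mockElement("p", main);
const secondP = mockElement("p", main, firstP);

const registry = { main };
const getXPath = makeXPath(id => registry[id]);

getXPath(body);    // "/html[1]/body[1]"
getXPath(secondP); // 'id("main")/p[2]'
```

The second result shows the id shortcut at work: as soon as an ancestor with a valid id is found, the path starts from there instead of walking all the way up to the root.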
