How To Find Broken Links Using Selenium WebDriver
Posted By : Aditi Nahar | 29-Sep-2018
Broken links can appear on any website, so it is important to find and fix them promptly and regularly. In this article, I discuss what broken links are, why they should not be left on a website, and how to find them on a web page using Selenium WebDriver.
I assume readers of this article have a good working knowledge of the Selenium WebDriver automation tool.
What are Broken Links?
Broken links are links or URLs that do not work because they point to an empty or non-existent web page. A link can be broken for a variety of reasons, for example the page is no longer available or the domain has expired.
A broken link can be external, i.e. it points to a web page outside the site's domain, or internal, i.e. it points to a web page on the same domain.
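To make the internal/external distinction concrete, here is a minimal sketch that compares host names. The class name LinkScope and the method isInternal are hypothetical names chosen for illustration, and the example.org address is a placeholder. Comparing hosts exactly is a simplification: "www.homepage.com" and "homepage.com" would count as different sites here, and URI.create throws an IllegalArgumentException for addresses containing illegal characters such as spaces.

```java
import java.net.URI;

final class LinkScope {

    // Returns true when linkUrl points to the same host as pageUrl.
    // Assumption: both arguments are absolute, well-formed URLs.
    static boolean isInternal(String pageUrl, String linkUrl) {
        String pageHost = URI.create(pageUrl).getHost();
        String linkHost = URI.create(linkUrl).getHost();
        return pageHost != null && pageHost.equalsIgnoreCase(linkHost);
    }

    public static void main(String[] args) {
        String page = "http://www.homepage.com/";
        System.out.println(isInternal(page, "http://www.homepage.com/about"));   // true
        System.out.println(isInternal(page, "http://www.example.org/partners")); // false
    }
}
```

A report that labels each broken link as internal or external can help decide who needs to fix it: internal links are under the site owner's control, external ones usually are not.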
What harm can Broken Links cause to a website?
A broken link leaves a bad impression on visitors and can therefore hurt a website's reputation and business. Links on web pages also affect how a website ranks in search engine results, so broken links can reduce a website's customer base and revenue. Given these consequences, broken links should be detected regularly and fixed immediately.
But a link cannot be confirmed as broken until it is requested and the response is checked. A website can have a large number of links, so checking each one manually is tedious and time-consuming. In this article, we therefore discuss how to complete this task with an automation tool.
Let’s first look at the steps to find Broken Links on a web page:
- Find all the links on the web page, i.e. the <a> and <img> elements, whose addresses are held in the href and src attributes respectively.
- Send an HTTP request for each link and get the HTTP response from the server.
- Check the HTTP response code (e.g. 200, 400, 401, 404, 500) to decide whether the link is valid or broken.
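The last step needs a rule for turning a response code into a verdict. A minimal sketch of one such rule follows, assuming that every status of 400 or above counts as broken. The class name LinkStatus and the method isBroken are hypothetical, and the threshold is a simplification: a 401, for instance, may simply mean the page sits behind a login rather than being truly broken.

```java
final class LinkStatus {

    // Treats any status of 400 or above as broken
    // (4xx client errors and 5xx server errors).
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    public static void main(String[] args) {
        int[] codes = {200, 400, 401, 404, 500};
        for (int code : codes) {
            System.out.println(code + " -> " + (isBroken(code) ? "broken" : "valid"));
        }
    }
}
```

Keeping the rule in one small method makes it easy to adjust later, for example to treat 401 or 403 as "needs login" instead of broken.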
Now let’s see the implementation of these steps:
// Import Packages
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.AfterTest;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class BrokenLinksHomepage {

    WebDriver driver;

    @BeforeTest
    public void setUp() {
        System.setProperty("webdriver.chrome.driver", "<ChromeDriver Path>");
        driver = new ChromeDriver();
        // Enter the URL of the page to check.
        driver.get("http://www.homepage.com/");
        driver.manage().window().maximize();
    }

    @AfterTest
    public void tearDown() {
        // Close the browser once all links have been checked.
        if (driver != null) {
            driver.quit();
        }
    }

    @Test
    public void checkBrokenLinks() {
        List<WebElement> allLinks = findAllLinks(driver);
        System.out.println("Total number of elements found: " + allLinks.size());
        // Identify and validate each URL.
        for (WebElement element : allLinks) {
            String address = getAddress(element);
            try {
                System.out.println("URL: " + address + " returned " + getResponseCode(new URL(address)));
            } catch (Exception exp) {
                System.out.println("At " + address + " exception occurred -> " + exp.getMessage());
            }
        }
    }

    public static List<WebElement> findAllLinks(WebDriver driver) {
        // Collect every <a> and <img> element on the page.
        // Copy into a new list so that addAll() never hits an unmodifiable list.
        List<WebElement> elementList = new ArrayList<WebElement>(driver.findElements(By.tagName("a")));
        elementList.addAll(driver.findElements(By.tagName("img")));
        // Keep only elements that point to an HTTP(S) address.
        List<WebElement> finalList = new ArrayList<WebElement>();
        for (WebElement element : elementList) {
            String address = getAddress(element);
            if (address != null && address.startsWith("http")) {
                finalList.add(element);
            }
        }
        return finalList;
    }

    // <a> elements hold their target in href, <img> elements in src.
    public static String getAddress(WebElement element) {
        String attribute = "img".equalsIgnoreCase(element.getTagName()) ? "src" : "href";
        return element.getAttribute(attribute);
    }

    // Returns the HTTP status code for the URL; 400 or above means the link is broken.
    // Connection failures are thrown to the caller instead of being reported as 404.
    public static int getResponseCode(URL url) throws Exception {
        // Send the HTTP request.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            connection.connect();
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
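Pages often repeat the same link in the header, body and footer, so it can be worth checking each address only once. The following is a minimal sketch of that idea, not part of the implementation above: the class name UniqueLinks and the method normalize are hypothetical, and it assumes the addresses have already been read from the page as strings. It drops null entries, strips fragments such as "#top" (which point into the same page), and removes duplicates while keeping the original order.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class UniqueLinks {

    // Returns the distinct addresses in first-seen order,
    // ignoring nulls and anything after a '#' fragment marker.
    static Set<String> normalize(List<String> addresses) {
        Set<String> unique = new LinkedHashSet<String>();
        for (String address : addresses) {
            if (address == null) {
                continue;
            }
            int hash = address.indexOf('#');
            String withoutFragment = hash >= 0 ? address.substring(0, hash) : address;
            if (!withoutFragment.isEmpty()) {
                unique.add(withoutFragment);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(
                "http://www.homepage.com/a",
                "http://www.homepage.com/a#top",
                null,
                "http://www.homepage.com/b");
        // Only two distinct addresses remain to be requested.
        System.out.println(normalize(found));
    }
}
```

On a page with many repeated links this reduces the number of HTTP requests, which shortens the test run and puts less load on the server being checked.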
This implementation for finding Broken Links on a web page is very simple. Try it and eliminate the manual effort.
Happy Testing:)