How To Find Broken Links Using Selenium WebDriver

Posted By : Aditi Nahar | 29-Sep-2018

Broken links turn up on almost every website, so it is important to find and fix them regularly. In this article, I discuss what broken links are, why they should not be left on a website, and how to find them on a web page using Selenium WebDriver.

I assume readers of this article already have a good working knowledge of the Selenium WebDriver automation tool.

 

What are Broken Links?  

Broken links are links or URLs that do not work because they point to an empty or non-existent web page. A link can be broken for a variety of reasons, for example the page is no longer available or the domain has expired.

A broken link can be external, i.e. it points to a web page outside the domain, or internal, i.e. it points to a web page on the same domain.
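The internal/external distinction can be checked in code by comparing hosts. The following is a minimal sketch, not part of the article's test class: the `LinkScope` class and its `isInternal` method are hypothetical names, and treating `www.homepage.com` and `homepage.com` as the same domain is an assumption that may not suit every site.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper (not from the article): decides whether a link is internal.
final class LinkScope {
    private LinkScope() {}

    // Returns true when linkUrl resolves to the same host as pageUrl.
    // Assumption: a leading "www." and letter case are ignored when comparing hosts.
    static boolean isInternal(String pageUrl, String linkUrl) {
        URI page = URI.create(pageUrl);
        String pageHost = normalizeHost(page.getHost());
        // Resolve relative links such as "/about" against the page URL first.
        String linkHost = normalizeHost(page.resolve(linkUrl).getHost());
        return pageHost != null && pageHost.equals(linkHost);
    }

    // Lower-cases the host and strips a leading "www."; null stays null (e.g. mailto: links).
    private static String normalizeHost(String host) {
        if (host == null) {
            return null;
        }
        String lower = host.toLowerCase(Locale.ROOT);
        return lower.startsWith("www.") ? lower.substring(4) : lower;
    }
}
```

For example, `LinkScope.isInternal("http://www.homepage.com/", "/about")` would be treated as internal, while a link to another host would be treated as external.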

 

What harm can Broken Links cause to a website?

A broken link leaves a bad impression on visitors and can therefore hurt a website’s reputation and business. Links on web pages also affect a website’s rank in search engine results, so broken links can reduce a website’s customer base and revenue. Given these consequences, broken links should be detected regularly and fixed promptly.

 

But a link cannot be confirmed as broken until it is requested and the response is checked. A website can have a large number of links, so checking each one manually is tedious and time-consuming. In this article, we will therefore discuss how to complete this task with an automation tool.

 

Let’s first look at the steps to find broken links on a web page:

  1. Find all the links on the web page, i.e. the href of each <a> tag and the src of each <img> tag.

  2. Send an HTTP request to each link and get the HTTP response from the server.

  3. Check the HTTP response code (e.g. 200, 400, 401, 404, 500) to determine whether the link is valid or broken.
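The last step, reading the status code, can be sketched as a small helper. This is only an illustration: the `LinkStatus` class is a hypothetical name, and the policy it encodes (2xx and 3xx count as valid, 4xx and 5xx count as broken) is an assumption. Some teams may prefer to treat 401 or 403 as "needs login" rather than broken.

```java
// Hypothetical helper (not from the article): interprets an HTTP status code.
final class LinkStatus {
    private LinkStatus() {}

    // Assumed policy: any 4xx (client error) or 5xx (server error) code means the link is broken.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    // Returns a short label for the status class, useful in test reports.
    static String describe(int statusCode) {
        if (statusCode >= 500) {
            return "server error";
        } else if (statusCode >= 400) {
            return "client error";
        } else if (statusCode >= 300) {
            return "redirect";
        } else if (statusCode >= 200) {
            return "ok";
        }
        return "informational or unknown";
    }
}
```

With this policy, a 200 or 301 response is reported as a working link, while 400, 401, 404 and 500 are all reported as broken.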

 

Now let’s see the implementation of these steps:

 

 
 
// Import packages
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.AfterTest;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class BrokenLinksHomepage {
	WebDriver driver;

	@BeforeTest
	public void setUp() {
		System.setProperty("webdriver.chrome.driver", "<ChromeDriver Path>");

		driver = new ChromeDriver();

		// Open the page whose links are to be checked.
		driver.get("http://www.homepage.com/");

		driver.manage().window().maximize();
	}

	@AfterTest
	public void tearDown() {
		// Close the browser once all links have been checked.
		if (driver != null) {
			driver.quit();
		}
	}

	@Test
	public void checkBrokenLinks() {
		List<WebElement> allLinks = findAllLinks(driver);

		System.out.println("Total number of elements found: " + allLinks.size());

		// Request each URL and print the HTTP status code it returns.
		for (WebElement element : allLinks) {
			String link = getLink(element);
			try {
				System.out.println("URL: " + link + " returned " + getResponseCode(new URL(link)));
			} catch (Exception exp) {
				// Covers malformed URLs, non-HTTP schemes (mailto:, javascript:) and network failures.
				System.out.println("At " + link + " exception occurred -> " + exp.getMessage());
			}
		}
	}

	// Returns the URL an element points to: "href" for <a>, "src" for <img>.
	public static String getLink(WebElement element) {
		String attribute = "img".equalsIgnoreCase(element.getTagName()) ? "src" : "href";
		return element.getAttribute(attribute);
	}

	public static List<WebElement> findAllLinks(WebDriver driver) {
		// Collect all <a> and <img> elements on the page.
		List<WebElement> elementList = new ArrayList<WebElement>(driver.findElements(By.tagName("a")));
		elementList.addAll(driver.findElements(By.tagName("img")));

		// Keep only the elements that actually carry a URL.
		List<WebElement> finalList = new ArrayList<WebElement>();
		for (WebElement element : elementList) {
			String link = getLink(element);
			if (link != null && !link.isEmpty()) {
				finalList.add(element);
			}
		}
		return finalList;
	}

	// Sends an HTTP request to the URL and returns the response code.
	// Failures are thrown to the caller instead of being reported as a 404.
	public static int getResponseCode(URL url) throws Exception {
		HttpURLConnection connection = (HttpURLConnection) url.openConnection();
		try {
			connection.connect();
			return connection.getResponseCode();
		} finally {
			connection.disconnect();
		}
	}
}
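The status check in the class above opens a default connection with no timeout, so a single slow server can stall the whole run. One possible refinement, sketched here under assumptions, is to send a HEAD request with explicit timeouts. The `LinkChecker` class, the `statusOf` method and the `UNREACHABLE` value of -1 are hypothetical names introduced for illustration. Note also that some servers do not support HEAD, in which case a fallback to GET may be needed.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

// Hypothetical helper (not from the article): status check with HEAD and timeouts.
final class LinkChecker {
    // Assumed sentinel for links that could not be requested at all.
    static final int UNREACHABLE = -1;

    private LinkChecker() {}

    // Returns the HTTP status code for the link, or UNREACHABLE when the link is
    // malformed, is not an http/https URL, or the request fails or times out.
    static int statusOf(String link, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            URL url = new URI(link).toURL();
            String protocol = url.getProtocol();
            if (!protocol.equals("http") && !protocol.equals("https")) {
                return UNREACHABLE; // e.g. mailto: links cannot be checked over HTTP
            }
            connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("HEAD");        // headers only, no response body
            connection.setConnectTimeout(timeoutMillis); // limit time to open the connection
            connection.setReadTimeout(timeoutMillis);    // limit time waiting for the response
            return connection.getResponseCode();
        } catch (Exception e) {
            return UNREACHABLE;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
```

A call such as `LinkChecker.statusOf(link, 5000)` would then give up after roughly five seconds per phase instead of waiting indefinitely.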

 

This implementation of finding broken links on a web page is very simple. Try it and cut down your manual effort.

 

Happy Testing:)

 
 
 

About Author

Aditi Nahar

Aditi is a certified QA Engineer with a strong command over management tool sets like JIRA and Trello, as well as QA tool sets for API and performance testing. She possesses excellent verbal and written communication skills and has gained valuable experience in management and leadership while collaborating with clients and large teams. Aditi's ability to apply creative thinking and problem-solving skills makes her adept at handling challenging business scenarios. Her proficiency in manual testing has proven instrumental in identifying issues and ensuring the functionality of applications across web, mobile, and TV platforms. She has made significant contributions to both internal and client projects, including Bits2Btc, AUS-BTC, EZBitex, ACL EAP, Scaffold, Iron Systems VRP, Oremus Zoho, and NOWCAST OTT.
