How to Automate a Webpage to Find Broken Links

Posted By Vivek Sharma | 29-Nov-2018

While browsing a site we often run into errors such as 404 or 500, or images that fail to load. These are usually caused by broken links. A manual tester can verify them, but if there are 50 links on each page and 10 pages, that means checking 500 links one by one, which is tedious. The better solution is to automate the pages with a small utility that checks for broken links.

Links are mostly associated with two tags, namely the a tag and the img tag. A link has a specific attribute, "href", which connects it to other pages. So first we have to find the list of all elements with the a tag.

List<WebElement> linkslist = driver.findElements(By.tagName("a"));


To the above list we can add the elements represented by the img tag, so that a single list holds both the a and the img elements, with the following line.

linkslist.addAll(driver.findElements(By.tagName("img")));


We don't have to test elements that have no href attribute, since they do not point anywhere and will not redirect to other pages. So we create one more list to store the href links and iterate through linkslist to collect all the active links.


List<WebElement> activelinks = new ArrayList<WebElement>();
for (int i = 0; i < linkslist.size(); i++) {
    if (linkslist.get(i).getAttribute("href") != null) { // skip elements without href
        activelinks.add(linkslist.get(i));
    }
}


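The last two lines of the loop body fetch the href links: the null check excludes every link or image that does not have an href attribute. Keep in mind that img elements normally carry their URL in the src attribute rather than href, so images need the same check on src if they are to be tested as well.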
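
Apart from a missing href, some values cannot be requested over HTTP at all, for example mailto: or javascript: links. The following is a minimal sketch of an extra filter, not part of the original utility. The class name LinkFilter and both method names are hypothetical, and it assumes the attribute values have already been read into plain strings and are absolute URLs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the original post): keeps only http(s) URLs.
class LinkFilter {

    // Returns true only for non-empty values that start with http:// or https://.
    static boolean isCheckable(String href) {
        if (href == null || href.trim().isEmpty()) {
            return false;
        }
        String lower = href.trim().toLowerCase();
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    // Copies the checkable values into a new list, preserving their order.
    static List<String> checkableUrls(List<String> hrefs) {
        List<String> result = new ArrayList<>();
        for (String href : hrefs) {
            if (isCheckable(href)) {
                result.add(href.trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "https://example.com/about", "mailto:someone@example.com", null);
        System.out.println(checkableUrls(hrefs));
    }
}
```

With a filter like this, only the values that can actually be opened with an HTTP connection reach the checking loop.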

Now we iterate over the active links list with a for loop and check each href URL using the HttpURLConnection API.

System.out.println("Number of active links and images: " + activelinks.size());


new URL() creates a URL object from a string. Calling openConnection() on that URL returns a connection, which is cast to the HttpURLConnection class and then used to connect to the link and read its response.

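for (int j = 0; j < activelinks.size(); j++) {
    // openConnection() returns a URLConnection, so cast it to HttpURLConnection
    HttpURLConnection connection = (HttpURLConnection) new URL(activelinks.get(j).getAttribute("href")).openConnection();
    String response = connection.getResponseMessage(); // e.g. "OK" or "Not Found"
    connection.disconnect();
}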
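
The response message is just text, so deciding whether a link is broken is easier with the numeric status code. Below is a minimal sketch under a few assumptions of its own: the class name LinkChecker and its methods are hypothetical, any status of 400 or above (or a failed request) is treated as broken, a HEAD request is used to avoid downloading the body, and the 5 second timeouts are arbitrary.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper (not from the original post): classifies a link by status code.
class LinkChecker {

    // Assumption of this sketch: 4xx, 5xx, or no valid response means broken.
    static boolean isBroken(int statusCode) {
        return statusCode < 0 || statusCode >= 400;
    }

    // Sends a HEAD request and returns the status code, or -1 if the request fails.
    static int fetchStatus(String link) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(link).openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(5000); // milliseconds
            connection.setReadTimeout(5000);    // milliseconds
            return connection.getResponseCode();
        } catch (IOException | ClassCastException e) {
            // Malformed URL, network failure, or a non-HTTP URL such as mailto:
            return -1;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // The default URL is only a placeholder for illustration.
        String link = args.length > 0 ? args[0] : "https://example.com";
        int status = fetchStatus(link);
        System.out.println(link + " --> " + status + (isBroken(status) ? " (broken)" : " (ok)"));
    }
}
```

Some servers do not answer HEAD requests properly, so falling back to GET for those links may be necessary.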


So, the whole program should look like the one in the picture below.



This way we don't have to check each and every link manually. The active and inactive links on a webpage are known in seconds, which makes testing far less cumbersome. Thanks for reading.
