While reorganising XML sitemaps with the help of Python, I learned that restructuring sitemaps can help resolve indexing issues. If you have followed a similar process and still have dozens of important URLs that are not indexed, the underlying problem lies with indexing itself. In this blog, you will learn how to automate the URL Inspection tool with the help of Python and JavaScript.
These days, the URL Inspection tool has become very popular, and it is an indispensable part of technical SEO work. The tool has five main use cases.
No API for the URL Inspection Tool
If you have ever tried to automate the URL Inspection tool, you may have been frustrated to find that there is no API available. So, how can you automate it without an API? The answer is a powerful technique widely used by software quality assurance teams: web browser automation. Google's terms of service explicitly prohibit automated queries, but the wording appears to focus on Google Search queries of the kind issued by search engine ranking trackers.
The Overall Approach
You will connect the browser automation code to a running instance of the Chrome browser, in which you have manually logged in to Google Search Console. Once connected, the code can direct the browser to open pages, click on page elements, extract content, and more.
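As a minimal sketch, attaching Pyppeteer (introduced in the setup below) to a Chrome instance that is already running with remote debugging enabled might look like this; the port number 9222 and the target URL are assumptions you can adjust:

```python
import asyncio
import requests
from pyppeteer import connect

async def attach_to_chrome():
    # Chrome must already be running with --remote-debugging-port=9222.
    # The debug HTTP interface exposes the DevTools WebSocket endpoint.
    info = requests.get("http://localhost:9222/json/version").json()
    browser = await connect(browserWSEndpoint=info["webSocketDebuggerUrl"])
    page = await browser.newPage()
    await page.goto("https://search.google.com/search-console")
    return browser, page

browser, page = asyncio.get_event_loop().run_until_complete(attach_to_chrome())
```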
How to Get Started - Python and JavaScript
You can write the content extraction code in JavaScript, as it is the most suitable language when it comes to DOM parsing and navigation. The rest of the automation can be driven from Python.
How to Set Up
In this article, I will use Pyppeteer, an unofficial Python wrapper for Puppeteer, the browser automation library from the Google Chrome team. It allows you to control Chrome through the DevTools Protocol interface. First, download and install Miniconda for your operating system from https://docs.conda.io/en/latest/miniconda.html. You can also install Jupyter and use it as the coding environment. Then install Pyppeteer with pip install pyppeteer; if you run into bugs after installing it, you may need to downgrade to an older release.
Start Chrome in Debug Mode
Once Pyppeteer and Jupyter are set up correctly, you can start Chrome in debug mode so that your script can control it. First, create a separate Chrome user profile to hold your Google Search Console login data, keeping the automation away from your regular browsing profile.
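One way to launch Chrome in debug mode with its own profile is from Python itself, via the subprocess module. The executable path and profile directory below are assumptions for a Windows machine; adjust them for your system:

```python
import subprocess

# Assumed paths -- change these to match your OS and Chrome install location.
CHROME = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
PROFILE = r"C:\chrome-gsc-profile"  # separate profile holding the Search Console login

subprocess.Popen([
    CHROME,
    "--remote-debugging-port=9222",  # exposes the DevTools protocol for Pyppeteer
    f"--user-data-dir={PROFILE}",    # keeps this session away from your main profile
])
```

After Chrome starts, log in to Google Search Console manually in that window before running any automation.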
Some Basic Automation Steps
Now that the browser is running in debug mode and you are logged in to Google Search Console, you can automate the URL Inspection tool. The actions to automate include selecting page elements, clicking them, typing the URLs to check, and extracting the report contents.
Selecting the Elements
To click web page elements and extract their contents, you have to provide their locations in the parsed DOM. You can do this with XPaths, or you can address the elements directly using CSS selectors. Right-click the element you want to select and click the "Inspect Element" option; then, in the Developer Tools' Elements view, right-click it again and, under the Copy section, click "Copy JS path".
Repeat this for every element you want to extract, and keep the resulting JS paths together in one place.
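As a sketch, the copied JS paths might be kept in a Python dictionary like the one below. The selector strings are placeholders: the class names Google generates for the URL Inspection report are obfuscated and change over time, so substitute the real JS paths you copied from DevTools.

```python
# Placeholder JS paths as copied from DevTools ("Copy > Copy JS path").
# Replace each value with the real JS path for the corresponding element.
SELECTORS = {
    "coverage":   'document.querySelector("#coverage > div.verdict")',
    "crawled_as": 'document.querySelector("#crawl > div.user-agent")',
    "last_crawl": 'document.querySelector("#crawl > div.date")',
}
```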
JavaScript Extraction
Next, create a JavaScript function that uses all of the JS paths for the extraction. This function is passed to Chrome, executed in the context of the target page, and its return value comes back to Python as a dictionary. Note that most of the data points can be extracted with several alternative selectors.
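A minimal sketch of such a function, assuming the placeholder SELECTORS dictionary shown earlier: the JavaScript runs inside the page, evaluates each JS path, collects the element's text, and Pyppeteer returns the resulting object to Python as a dictionary.

```python
# JavaScript run in the page context. Since JS paths copied from DevTools
# are full querySelector expressions, they can be evaluated directly.
EXTRACT_JS = """
(selectors) => {
  const result = {};
  for (const [name, jsPath] of Object.entries(selectors)) {
    const el = eval(jsPath);
    result[name] = el ? el.textContent.trim() : null;
  }
  return result;
}
"""

async def extract_report(page):
    # Pyppeteer serializes the argument and the returned object as JSON.
    return await page.evaluate(EXTRACT_JS, SELECTORS)
```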
Now that you have completed the preparatory steps, you can start the automation process. Create a list of the URLs you want to check. The automation is quite slow and resource-intensive, and you have to be logged in to your Google account. The coolest part is watching the browser type each URL to check, character by character.
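Putting it together, a sketch of the inspection loop might look like the following. The search-box selector is a placeholder to be replaced with the one copied from DevTools, and the fixed wait is a crude stand-in for smarter readiness checks:

```python
import asyncio

urls_to_check = [
    "https://www.example.com/page-1",  # replace with your own URLs
    "https://www.example.com/page-2",
]

SEARCH_BOX = "form input"  # placeholder selector for the inspection search box

async def inspect_urls(page, urls):
    results = []
    for url in urls:
        await page.click(SEARCH_BOX)                     # focus the search box
        await page.type(SEARCH_BOX, url, {"delay": 50})  # typed character by character
        await page.keyboard.press("Enter")
        await page.waitFor(8000)                         # wait for the report to render
        data = await extract_report(page)
        data["url"] = url
        results.append(data)
    return results

results = asyncio.get_event_loop().run_until_complete(inspect_urls(page, urls_to_check))
```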
Performing the Analysis
After Chrome and Pyppeteer have done their work, you will have the indexing data for each URL in a list of dictionaries, which you can convert to a pandas DataFrame. Once the data is in pandas, you can slice it easily to isolate the main reasons pages are missing from the index.
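For example, the list of dictionaries converts straight into a DataFrame; the column names below follow the placeholder SELECTORS keys used earlier, and the verdict text is an assumption to match against what the report actually shows:

```python
import pandas as pd

df = pd.DataFrame(results)

# Count the coverage verdicts to surface the main exclusion reasons.
print(df["coverage"].value_counts())

# Isolate the URLs that are not indexed.
not_indexed = df[df["coverage"] != "URL is on Google"]
print(not_indexed[["url", "coverage"]])
```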
This guide should have given you a clear idea of how to automate the URL Inspection tool with the help of Python and JavaScript. For more resources on this topic, you can follow our blog.