WordPress is one of the biggest open-source CMSs (Content Management Systems) in use on the internet today. It’s written in PHP, and despite security concerns about both the CMS and the PHP language, it remains one of the easiest and most customizable CMS engines out there. The main reasons for its popularity are its ease of use and the tons of customization options it provides in the form of plugins and themes.
Today we will look into how WordPress security works and some of the science behind popular WordPress security scanners like “wpscan”. I will be looking closely at each security problem and writing scripts to automate most of the scanning and reconnaissance process in Python. This post assumes that you have some basic understanding of WordPress, but you don’t have to be a WordPress wizard who has contributed tons and tons of patches and updates to the WordPress engine and/or its plugins.
All the resources like scripts and the WordPress installation used in this post will be available on GitHub. This will be a two-part blog post on the Appknox security blog, so stay tuned for the upcoming one where I will discuss things like creating a custom WordPress vulnerability database from scratch as well as other stuff 😉.
Before we even start exploiting any WordPress-based website, we need to figure out whether the backend used by the website/webapp is even WordPress. Makes sense, right? You might be wondering how to find out whether the backend is WordPress or not. Well, let’s look at a few methods of detecting a WordPress backend.
So all WordPress websites have an admin dashboard which allows the site administrator to manage the site’s look and feel, the posts and other things like plugins and stuff. Now of course you won't be able to get access to an admin dashboard that easily without proper credentials, but we don't need that either way. We need to only figure out if the dashboard is even present.
This is what a typical WordPress admin dashboard looks like. It may differ on really advanced and complicated sites, as they tend to change a few color schemes here and there, but in general this is how it looks 90% of the time.
Typical WordPress Dashboard
Now wait, this is the logged-in view of the dashboard. You won’t be seeing this until you crack the password and username of the site owner like John Wick! We aren’t interested in those things right now. What we are interested in is the login page itself. Let me explain…
So to get here you have to append “/wp-admin” to the website domain. This is what a typical WordPress login screen looks like.
Now again, this page may vary from website to website. Sometimes they fancy up the CSS of the page and have the website logo instead of the WordPress one on top. But this is how it looks in general. Once we have this, we can confirm that the website is running on WordPress. Before I move on from this method I would like to clarify one thing, and that is login page remapping. This simply means the site administrator has purposely moved the page to some other route in order to hinder information gathering by a potential attacker.
The “/wp-admin” route itself is a redirect to “DOMAIN_NAME/wp-login.php?redirect_to=”. So if the site administrator has remapped “/wp-admin”, you can request “/wp-login.php” directly in order to get to the login page. Additionally, you can use tools like dirsearch or gobuster to find the login page if you want.
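The check described above can be sketched in Python. This is only a minimal sketch: the function names are hypothetical, and the markers assume a stock, unmodified login form (fields named “log” and “pwd” inside a form with the id “loginform”), so a heavily customized login page may slip past it.

```python
from urllib.parse import urljoin, urlparse

# Markers assumed from a default, unmodified WordPress login form.
LOGIN_MARKERS = ('id="loginform"', 'name="log"', 'name="pwd"')


def login_candidates(base_url):
    """Return the URLs worth probing for a login page, in order."""
    base = base_url.rstrip("/") + "/"
    return [urljoin(base, "wp-admin/"), urljoin(base, "wp-login.php")]


def looks_like_wp_login(final_url, html):
    """Heuristic: True if the response resembles the stock login page.

    final_url is the URL after redirects, html is the response body.
    """
    path = urlparse(final_url).path
    hits = sum(marker in html for marker in LOGIN_MARKERS)
    return path.endswith("wp-login.php") or hits >= 2
```

You would fetch each candidate with your HTTP client of choice, follow redirects, and pass the final URL and body to looks_like_wp_login().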
If the above method doesn’t work and you aren’t able to locate the admin dashboard, it's ok, life ain’t easy and I understand that. So let’s move to the second method.
WordPress Related Asset Loading
Now this is pretty simple to execute and it ain’t no rocket science. This technique simply means monitoring the website traffic and checking for common WordPress directories like “wp-content”, “wp-content/uploads”, “wp-includes”, etc. We don’t need an intercepting proxy like BurpSuite or MiTMProxy for this, we can simply use the browser’s developer tools. Let’s look at an example.
This is how a typical WordPress website loads the assets and other scripts it needs to function.
So the above two screenshots are from my browser’s developer tools. You can see that in both requests the assets are loaded from WordPress-specific directories, i.e. “wp-content” and “wp-includes”. This indicates that the site is using WordPress, since these are WP-specific directories.
Now this is a mostly foolproof method of detecting a WordPress-based backend, but it can sometimes fail, since the site administrator can move the assets to a separate CDN or change the directories they are loaded from.
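The same check can be done on the raw HTML instead of the developer tools. Here is a rough sketch, assuming you already have the page source as a string; the regular expression and the function name are my own and only look for quoted references into “wp-content” and “wp-includes”.

```python
import re

# Matches quoted (or url(...)) references into WordPress-specific directories.
WP_PATH_RE = re.compile(
    r"""["'(]([^"'()\s]*?/(wp-content|wp-includes)/[^"')\s]*)"""
)


def find_wp_assets(html):
    """Group asset URLs found in the HTML by the WordPress directory they use."""
    found = {}
    for url, directory in WP_PATH_RE.findall(html):
        found.setdefault(directory, []).append(url)
    return found
```

An empty result doesn’t prove anything (assets may live on a CDN), but hits in both directories are a strong hint.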
If this doesn’t work, don’t worry, we have another trick up our sleeve to find out whether the website is running WordPress.
WordPress REST Endpoint
Now to understand this we will have to look at what WordPress REST Services are. In short WordPress in version 4.4 started the implementation of the REST service that allows the WordPress installation to run in headless mode and serve data from REST Endpoints. So you can have your own blog app and connect the WordPress JSON Endpoints to your app and directly fetch data from your blogs, etc.
Now, by making a request to one of the many WP REST endpoints, we can figure out whether the website is running on WordPress or not. WordPress follows a typical routing convention for the REST APIs, for example “https://DOMAIN_NAME/wp-json/wp/VERSION/comments/”. The version is typically “v2”. Making a request to a valid JSON REST endpoint gives the following information.
The following image shows a typical response of a WordPress JSON REST endpoint in a web browser.
So even this can be a detection factor for a WordPress based backend. Even though some WordPress JSON API can be disabled by the site administrator, most installations of WordPress do have this enabled in production mode.
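To automate this, one option is to request the REST index at “/wp-json/” and look at what comes back. The sketch below assumes the index is a JSON object with a “namespaces” list that includes something like “wp/v2”; the function name is made up for illustration.

```python
import json


def is_wp_rest_index(body):
    """Return True if the body looks like a WordPress REST API index."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all, e.g. an HTML 404 page.
        return False
    if not isinstance(data, dict):
        return False
    namespaces = data.get("namespaces", [])
    return isinstance(namespaces, list) and any(
        isinstance(ns, str) and ns.startswith("wp/") for ns in namespaces
    )
```

If the administrator has disabled or hidden the REST API this returns False, which only means this particular check was inconclusive.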
There’s a bunch of WordPress API endpoints and we will look into some of them when we get to the information gathering & reconnaissance stage of this blog post.
Now there are more ways which can be used to detect if a website is running on WordPress backend but these are the most common ones and mostly used by proprietary scanners as well. You can freely use your own methods as well and find the backend.
Once we have detected that the website is running WordPress, we can start the information gathering and reconnaissance process to find information about the WordPress installation. This helps us channel the attacks much more precisely.
Now there are a lot of ways to do recon on WordPress, and the full list can get exhausting. It’s totally up to the attacker how they wish to carry out this process. There are a lot of steps you can skip if you are not hunting for that specific thing. For example, if you aren’t interested in XML-RPC detection and wanna focus on WordPress version CVEs and plugin vulnerabilities, then you can skip everything except those two.
Anyways, enough talking let’s look at some methods of conducting WordPress reconnaissance.
Enumerating WordPress Version
Now this is by far the most important step you might wanna conduct in the recon phase. It allows us to find vulnerabilities directly in the WordPress system (CVEs, in short). We will go in depth on finding CVEs faster and more efficiently in the next part of this blog, but for now let’s look at some of the ways you can figure out the version of WordPress running on the server.
HTML Source Of Dashboard
This is by far the most trusted way to determine the version of a WordPress installation. The step is simple. If you remember the admin dashboard we talked about in the detection phase, it’s technically the same thing. Navigating to the WordPress dashboard and opening the page source in the browser (right click, then “View Page Source”) gives the following information.
If you look at the above image (second one) you will see certain links which load the CSS files of the WordPress dashboard. These are WordPress’s own files, and if you look closely at the end where it says ?ver=5.9.3, that’s the version of WordPress we are dealing with here. So with this we can conclude this is WordPress version 5.9.3.
See I said it was simple 😂. This is one of the easiest and proven ways of finding a WordPress version. But hey we won’t be stopping at this. Remember I said we will be automating the entire process of attack and recon in the start? Yeah, so we are gonna start doing that from now.
Let’s pull up the text editor and write a python script to detect the version of WordPress using this source code method.
The code above is pretty simple. We use the requests library to make a request to the admin dashboard URL and get the HTML of the page. We then use Python’s built-in HTMLParser to read the HTML and extract the “<link>” tags from it.
We then extract the “href” attribute of each “<link>” and pass it to a function called process_version(). This function simply looks for a “?” in the URL, since we know the CSS file link ends with “?ver=version_number”. Splitting at the “=” and grabbing the second element of the resulting array gives us the version of WordPress. Executing the script will give the following result.
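Here is a minimal sketch of the approach described above, not the exact script from the post. It keeps the process_version() name but reads the “ver” query parameter with urllib instead of splitting on “=”, and it works on an HTML string so the fetching part is left to you. The class and helper names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


def process_version(href):
    """Return the value of the ?ver= parameter in a URL, or None."""
    values = parse_qs(urlparse(href).query).get("ver")
    return values[0] if values else None


class LinkVersionParser(HTMLParser):
    """Collects the ?ver= value of every <link href=...> it sees."""

    def __init__(self):
        super().__init__()
        self.versions = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        href = dict(attrs).get("href")
        version = process_version(href) if href else None
        if version:
            self.versions.append(version)


def versions_from_html(html):
    """Return every version string found, duplicates included."""
    parser = LinkVersionParser()
    parser.feed(html)
    return parser.versions
```

With the requests library this could be called as versions_from_html(requests.get(url).text). Every matching stylesheet contributes one entry, so the same version shows up several times.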
Ohkk, now I know what you might be thinking 😅. Why is it printing the version so many times? Yeah I know, we will do the fixing later. This is just an automation PoC. Alright, with this sorted let’s move on to the second way to find the version.
Ok, so this method is quite simple as well. It reads the RSS feed that WordPress generates to find the version of the running installation. All you have to do is append “/feed/” to the URL of a WordPress-based website and it will return an XML document with a lot of stuff, but we ain’t interested in most of it. We are only interested in the “<generator>” section, which displays the version of WordPress.
Alright, let’s look at an example. The following image shows the feeds XML document on a running WordPress site.
If you focus on the “<generator>” section of the above image, you will find something similar to the HTML source of the WordPress dashboard. The number after the “?v=” is the version of WordPress the server is running currently. Alright so with that out of the way let’s write a python script to automate this as well.
In the above screenshot you can see the Python code that extracts the XML tag from the feed of the WordPress site (in this case it’s downloaded to my filesystem). Well, nothing much to explain here, it’s pretty simple. The Python code simply reads the XML file from the WordPress site and iterates over the XML tree to find the keyword “?v=”. Once found, the data is printed out.
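A rough equivalent of that idea using only the standard library could look like this. It is a sketch under the assumption that the feed carries a “<generator>” element whose text ends in “?v=” followed by the version; the function name is my own.

```python
import xml.etree.ElementTree as ET


def version_from_feed(xml_text):
    """Return the version after '?v=' in the feed's <generator>, or None."""
    root = ET.fromstring(xml_text)
    for generator in root.iter("generator"):
        text = (generator.text or "").strip()
        if "?v=" in text:
            return text.split("?v=", 1)[1]
    return None
```

Some sites strip the generator tag on purpose, in which case this simply returns None.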
Alright, at this point you might ask are there other methods of finding WordPress versions. Well yeah, there are, so let’s proceed further.
Source Of Blog Front Page
Well, this method is exactly the same as the first one we saw. Instead of the admin dashboard, we are looking at the blog/website itself this time. Alright, let’s see how it’s done.
Let’s visit my blog “electrondefuser.com” and look at the source of the webpage. Now we need to look at content that loads from WordPress-specific locations like “/wp-content”, “/wp-includes”, etc.
So I picked this one specific CSS file (randomly ofc!) and if you look at the end of the CSS URL you can see “?ver=”. For files that ship with WordPress itself, the number after that is the version of WordPress the server is running (plugin and theme files usually carry their own version there instead). See, I told you it’s easy. The automation code for this one can also be the same as the first one (the admin dashboard one).
Alright. Is it done? Well no, I have one more technique up my sleeve to detect a WordPress version. But bear in mind that this is supposed to be the last resort and it’s better if this process is automated. So let’s take a look at the final way.
Fingerprinting WordPress Public Files
Alright, so this is the last trick we have up our sleeve to find the WordPress version. This method basically compares the public files of the target WordPress site against the official release archives, which can be downloaded from the WordPress website.
Let’s take a public CSS file of WordPress. I’ll take “dashicons.min.css”, which you can see in the HTML source code in the above methods. Now I’ll download the WordPress software archive from their website. Since I know that this is a modern blog, I can safely assume that the version is somewhere between 5.7 and 6.0 (the latest one). Now for the sake of this example, I’ll demonstrate on 5.8.4, 5.9.3, 6.0, and 5.7.6.
Alright there you go, I have them downloaded successfully. Fine, now let’s go to the target WordPress installation and check some of its public files.
Ok so we can see 4 files in this screenshot. We will take the top two (randomly ofc! You can choose whatever you want) i.e dashicons.min.css and buttons.min.css. Now let’s open the downloaded WordPress archive and extract these files from there. Since we know the files come from “wp-includes/css”, we can open that directory and extract the files from there.
Alright, so I have extracted both files from each WordPress archive into separate directories named after the version, and each folder has those files in it. This is what the arrangement looks like.
Alright, we are done with this part. Now let’s download the files from the server via “wget” and save them locally, because we need them to compare the hashes and find the correct version.
Alright, so now we have the server files as well. For the sake of simplicity, we will rename these two files and add “-server” before the file extension, to make it clear that these are files from the target installation. Now on to the final show: let’s compare the hashes of all these downloaded versions with the ones we got from the server side.
I’ll use a command called “md5sum” on Linux to create a hash of the files, but you can use whatever hashing tool you want, it doesn’t matter.
The above screenshot shows the MD5 hashes of the files we got from the server (the target WordPress installation).
Alright, so here are all the hashes of the WordPress public files that we wanted. You can clearly see both the hashes of the files from the server and the local ones match with the version 6.0 installation. So here we can assume that the WordPress installation running on the server is version 6.0.
Here I would like to mention one thing before we proceed and that's the similar hashes of some files. If you look at the file “dashicons.min.css” in the version numbers “5.7.6” and “5.9.3”, you will find that the hashes are the same as the target ones but the other css file has a different hash. This is mainly because some files in WordPress don’t get changed once they are stable. This is why this method may not be perfect every single time.
The best way to execute this method might be to build a full hash library of all the public files across WordPress releases and compare against the files whose hashes are unique (or nearly unique) to a version.
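The comparison step can be sketched in a few lines of Python. This assumes you have already built a small “library” dictionary by hashing the files from each downloaded release; the structure and the function names are hypothetical.

```python
import hashlib


def md5_hex(data):
    """MD5 hex digest of a bytes object (same value md5sum would print)."""
    return hashlib.md5(data).hexdigest()


def matching_versions(server_hashes, library):
    """Return the versions whose files all match the server's hashes.

    server_hashes: {filename: md5}
    library:       {version: {filename: md5}}
    """
    matches = []
    for version, files in library.items():
        if all(files.get(name) == digest
               for name, digest in server_hashes.items()):
            matches.append(version)
    return matches
```

If you only hash a file that rarely changes, several versions will match, which is exactly the ambiguity described above. Adding more files narrows the list down.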
Enumerating WordPress Plugins
So once we have our version detection, let’s move on to some more serious stuff on WordPress. This method describes enumerating plugins. WordPress plugins, as the name suggests, are pieces of code that add to or extend the functionality of WordPress. Plugins are mostly written in PHP via some APIs provided by WordPress.
But the million-dollar question is “Why should you care about finding WordPress plugins?”. Ok, so let me explain. Most vulnerabilities found in WordPress right now are found in plugins. Things like SQL injection, XSS, etc. mostly show up in plugins, which in turn exposes the blog directly. So yeah, an outdated plugin version can expose the website to a potential attack. This makes finding out which plugins are used in the WordPress installation an extremely important step.
Alright, so let’s start the process. WordPress plugins are installed by creating a folder on the server side, under “wp-content/plugins”, named after the plugin slug. A slug is nothing but a unique name given to a plugin so that it can be stored in its own folder on the server side. So let’s try to see how this works.
Let’s go to my blog and try to enumerate the plugins installed there. Since I told you that plugins are stored in the “wp-content” folder, let’s try to look for plugins available there. This is what a typical plugin directory looks like.
So now since we know the trick let’s try to do it. Let’s look at the two screenshots below.
The first image shows a “Not Found” or a “404 Response” from the server because I am trying to access a plugin that doesn’t even exist on the server. With this we can conclude that the plugin is not installed in the target WordPress installation.
The second image shows a plugin that gives out a “Forbidden” or “403 Response”. This means the folder exists but we aren’t allowed to access it. Well, we don’t need access either, because with this we can conclude that the plugin is installed. Now with this information, we can search the internet for CVEs in the plugin and try to target them accordingly.
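The status-code logic above is easy to script. The sketch below uses urllib from the standard library and assumes plugins live under “wp-content/plugins/SLUG/”; treating a 200 as “installed” is my own addition (some servers allow directory access), and the function names are hypothetical.

```python
import urllib.error
import urllib.request


def plugin_url(base_url, slug):
    """Build the folder URL for a plugin slug."""
    return f"{base_url.rstrip('/')}/wp-content/plugins/{slug}/"


def classify_status(status):
    """Map an HTTP status code to a verdict about the plugin folder."""
    if status == 404:
        return "not installed"
    if status in (200, 403):
        return "installed"
    return "unknown"


def probe_plugin(base_url, slug, timeout=10):
    """Request the plugin folder and classify the response."""
    try:
        with urllib.request.urlopen(plugin_url(base_url, slug),
                                    timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx, so the status code lives on the error.
        return classify_status(err.code)
```

Connection failures are not handled here and will raise. Only run this against installations you are allowed to test.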
But here’s an issue that you might face. As of writing this article, there are more than 50K WordPress plugins. How would we find all the slug names for all those 50K+ plugins in the store? No one wants to go search everything in the WP store and write everything down right? So to counter this we will use the official WordPress API to gather all information about the plugins.
Now I'll only demonstrate how to consume the API and gather data from it and in the future, we will try to automate the entire process of enumerating the plugins.
The base URL of the WordPress plugin API is “api.wordpress.org/plugins/info/1.1/”. You can send a GET request to it with two extra parameters, “action=query_plugins” and “request[page]=”. The first tells the backend that you want to query plugins, and the page parameter controls which page of plugins is returned.
This is what a typical API response looks like. Now there are a lot of ways the output can be cleaned by using additional parameters to exclude unnecessary response data. By incrementing the “request[page]” parameter in the URL, the next set of plugins will be displayed in the response. The below link can be used for knowing more about the API and types of parameters that can be used in it.
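Building those paginated URLs and pulling the slugs out of a response can be sketched like this. The URL and the two parameters come from the description above, but the response shape (a “plugins” collection whose entries have a “slug” field) is an assumption on my part, so check it against a real response first.

```python
from urllib.parse import urlencode

API_BASE = "https://api.wordpress.org/plugins/info/1.1/"


def query_plugins_url(page):
    """URL for one page of the plugin listing."""
    params = {"action": "query_plugins", "request[page]": page}
    return API_BASE + "?" + urlencode(params)


def slugs_from_response(payload):
    """Extract plugin slugs from a decoded JSON response.

    Assumes payload["plugins"] is a list (or dict) of plugin objects.
    """
    plugins = payload.get("plugins", [])
    if isinstance(plugins, dict):
        plugins = plugins.values()
    return [p["slug"] for p in plugins
            if isinstance(p, dict) and "slug" in p]
```

Note that urlencode() percent-encodes the square brackets in “request[page]”.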
Alright, this is an interesting one. So XML-RPC stands for Extensible Markup Language Remote Procedure Call. It is a system that allows WordPress to communicate with other services, using XML as the data format. Since WordPress isn’t a self-enclosed system and occasionally needs to communicate with other systems, XML-RPC was brought in to handle that job.
But you might ask, why is this interesting at all? Well, great question! XML-RPC is filled with security flaws that can be used to conduct a lot of exploits and reconnaissance. But before we explore exploitation, let’s figure out how to detect if XML-RPC is enabled and get a list of methods that are enabled in the target installation. Methods are nothing but functions that are exposed through XML-RPC. Most are enabled by default, but the maintainer might disable some depending on the use case.
To check the existence of XML-RPC, just add “/xmlrpc.php” at the end of the target WordPress URL. You should get a response something like this.
The above image says that XML-RPC needs a POST request to be sent. If you get such a response, that means XML-RPC is enabled on the server. Now let’s try to send a POST request to the same URL and check the response.
Ok so we get a response. This proves XML-RPC is working. Before we proceed, let’s try to get all the enabled methods in XML-RPC. Use the following code shown below in the screenshot to find all the enabled XML-RPC methods on the target installation.
By sending the above request you will get the following response.
The “wp.[method-name]” are the methods that are enabled and available on the target installation of WordPress. We will get back to this once we start the exploitation of XML-RPC, but for now, we have XML-RPC enabled on the server and that increases our attack surface.
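Python’s standard xmlrpc.client module can build and parse these messages for us. The sketch below assumes the request in question is the standard “system.listMethods” introspection call, which is my assumption about the screenshot; the helper names are hypothetical.

```python
import xmlrpc.client


def list_methods_payload():
    """XML body for a system.listMethods call (takes no parameters)."""
    return xmlrpc.client.dumps((), methodname="system.listMethods")


def parse_methods(response_xml):
    """Return the list of method names from the XML-RPC response body."""
    params, _method = xmlrpc.client.loads(response_xml)
    return list(params[0])


def wp_methods(methods):
    """Keep only the WordPress-specific 'wp.' methods."""
    return [name for name in methods if name.startswith("wp.")]
```

You would POST list_methods_payload() to “/xmlrpc.php” and feed the response body to parse_methods(). If the server answers with a fault, xmlrpc.client.loads() raises xmlrpc.client.Fault instead of returning.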
Enumerating WP-JSON Rest Services
So this method basically is trying to gather as much information as possible via the WordPress’s JSON API which is used for its headless operation. WordPress provides a REST API service for managing blogs and stuff without actually using its dashboard etc.
This is also used by app developers who want mobile apps for their blogs etc. The blog maintainers can use this REST service to get the data directly from their WordPress backend. But hey you might ask how is this even helpful to us? Well, I am glad you asked. There are a lot of endpoints we can use to gather information about the target WordPress installation.
We will cover most of it in the future and we will see one in action in the “Enumerating Users” section, but for now I’ll show how you can use the API services in the first place. Now in very few cases (if you are really unlucky!) the blog maintainers disable the REST services altogether, but in most cases they are enabled.
The WordPress REST services follow a typical routing pattern, which looks something like “DOMAIN_NAME/wp-json/wp/v2/”.
After the “v2” comes the actual methods and routes used for gathering data. A typical WP REST response looks something like this.
Ok so we will stop here and continue to look at this in the later sections but if you are interested to know more about the REST API, the below link will take you to the API handbook of WordPress and that will serve as a great resource for anyone who is a security researcher or blog maintainer.
Alright, so this is an interesting one: enumerating the usernames/users of the blog. This will help us with brute-forcing in general, because usernames are what we need to log in to the dashboard. Moreover, in some CTF (Capture the Flag) challenges you need the username in order to complete the challenge as well.
Alright, I guess this is simple since the only goal is to gather the username. There are 3 ways by which you can conduct user enumeration. Let’s discuss each of them briefly.
Reading via WP-JSON REST
I hope you remember the WordPress JSON API we talked about in the last section right? Well, we will put that into action right now. So everything in the format remains the same except we will add “/users” route at the end to grab all users.
This is what a typical REST response from the user route looks like. Bear in mind this is just a snippet of information. You will have a lot more information than this (sometimes even email!!) but I won’t be showing you that level of detail since that would maybe be privacy-invading.
Most of the time the REST API will be active and blog maintainers might leave this open. If this doesn’t work for you, we do have other methods but this is one of the easiest to get by.
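Parsing that response is straightforward. A small sketch, assuming the “/users” route returns a JSON list of objects with “id”, “slug” and “name” fields and returns a JSON object (an error) when access is restricted; the function names are my own.

```python
import json


def users_route(base_url):
    """URL of the users collection on the REST API."""
    return base_url.rstrip("/") + "/wp-json/wp/v2/users"


def users_from_response(body):
    """Return (id, slug, name) tuples from the /users response body."""
    users = json.loads(body)
    if not isinstance(users, list):
        # Error responses come back as a JSON object, not a list.
        return []
    return [(u.get("id"), u.get("slug"), u.get("name"))
            for u in users if isinstance(u, dict)]
```

The “slug” is usually the interesting part, since it tends to match the login name.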
Scanning Yoast SEO
This is another cool trick that we can use to extract information about the WordPress users. This method relies on the site owner having a plugin named “Yoast SEO” installed. What is Yoast SEO, you might ask? Well, from their own website we can get this information:
“Yoast SEO is a search engine optimization plug-in for WordPress. The plugin has five million active installations and has been downloaded more than 350 million times”
Well, in short it’s just an SEO plugin for WordPress used for making the blog search engine friendly and stuff. But how can we use this to our advantage, you might ask? Yoast SEO creates something known as an “author-sitemap”, which is basically a sitemap listing the author archive pages so that search engines can index them.
Turns out that the sitemap contains the author names as well and we can easily use it to our advantage to find the username. For this, the only thing we need to do is add an “/author-sitemap.xml” at the end of the base domain URL.
This is what it looks like. You can see “/author/elctrondefuser” in the table. Anything after “/author/” is the actual username. Well, that’s it. Easy, right? But in some cases this might not be a solution, because Yoast SEO is not installed on the target WordPress installation. If that’s the case, then we have one more trick to find the users. This is kinda tricky, but if executed properly it can work.
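Extracting the usernames from that sitemap can be automated with the standard library. This is a sketch that assumes the sitemap is a regular XML sitemap with “<loc>” entries pointing at “/author/NAME/” URLs; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def authors_from_sitemap(xml_text):
    """Return the path segment after 'author' for every <loc> URL."""
    root = ET.fromstring(xml_text)
    authors = []
    for element in root.iter():
        # Tags carry the sitemap namespace, so match on the suffix.
        if not element.tag.endswith("loc") or not element.text:
            continue
        path = urlparse(element.text.strip()).path
        parts = [p for p in path.split("/") if p]
        if "author" in parts[:-1]:
            authors.append(parts[parts.index("author") + 1])
    return authors
```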
Author ID Brute Forcing
Well, this is a pretty easy trick, but people tend to get it wrong very often, and some CDNs and firewall services like Cloudflare tend to block multiple requests from a single user in a short span of time.
The author ID is a unique identifier that WordPress uses to identify the author details in the backend. It can be accessed via the “/?author=” route: adding the ID after the “=” will redirect you to that author’s posts.
To tell whether an author ID is valid, we can check the response code of the request. A “200” code means the request was successful, while a “404” means not found, and we can conclude that the author ID doesn’t exist.
In the above two images, you can see the curl requests going through. The first one of course is 200 status code which means the author was found and the second one is 404 which means it doesn’t exist.
The “-i” in curl tells it to display headers and “-L” says to follow the redirect. That's why in the first request there are two headers. One being 301 which is technically a redirect and the second is 200 which means the redirected resource was found.
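The same loop can be written in Python. The sketch below leans on the fact that urllib follows redirects by default (like “-L” in curl) and assumes the final URL has the form “/author/NAME/”. The delay between requests is my own addition to avoid tripping rate limits, and the function names are hypothetical.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import urlparse


def author_probe_url(base_url, author_id):
    """URL that asks WordPress for the author with the given ID."""
    return f"{base_url.rstrip('/')}/?author={author_id}"


def author_from_final_url(final_url):
    """Return the path segment after 'author' in the final URL, or None."""
    parts = [p for p in urlparse(final_url).path.split("/") if p]
    if "author" in parts[:-1]:
        return parts[parts.index("author") + 1]
    return None


def enumerate_authors(base_url, max_id=10, delay=1.0):
    """Probe author IDs 1..max_id and return {id: username}."""
    found = {}
    for author_id in range(1, max_id + 1):
        try:
            url = author_probe_url(base_url, author_id)
            with urllib.request.urlopen(url, timeout=10) as resp:
                slug = author_from_final_url(resp.geturl())
        except urllib.error.HTTPError:
            # A 404 (or any other HTTP error) is treated as "no author".
            slug = None
        if slug:
            found[author_id] = slug
        time.sleep(delay)
    return found
```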
So this is an optional step, but it can sometimes be useful to crawl the site and create a full sitemap to find posts that might contain private information. Some lifestyle bloggers share things about their daily routines in blog posts and in some cases reveal things they like, such as music albums, their pet’s name, etc. These can be used as potential password candidates too, because people tend to build passwords around things they like and remember.
Now I know this is going towards the Red-Team operation route but these things can be useful in conducting full pentesting of WordPress blogs. Now you can obviously go over each page and find the information one by one painfully or you can use the below two methods to efficiently conduct crawling.
Scanning the Sitemap File
This is a simple process again, but it relies on the same Yoast SEO plugin that we used in the user enumeration part. The method is pretty simple: we append “/sitemap.xml” to the domain URL and we are presented with very basic table-looking data that contains all the links inside the blog.
You can download and analyze it and call it a day, but sometimes the site owner might limit the number of entries in the sitemap.xml by editing the Yoast settings. If that’s the case, then you can use the second method in the list.
Via WP-JSON REST
Well Surprise, we have our good old friend, the REST service of WordPress yet again to help us. If you are lucky and find the REST service is enabled in the backend, you can simply make requests to the “/posts” route and gather information page by page very easily.
You can even search comments in each post via the post id and further deepen your reconnaissance. This will help to find the users commenting about stuff and other blogs or links which can be beneficial information depending upon your depth of analysis.
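As a starting point for that kind of crawl, here is a small sketch that builds the paginated URLs and summarizes one page of results. It assumes the standard “page”, “per_page” and “post” query parameters of the WordPress REST API and the usual “id”, “link” and “title.rendered” fields on each post; the function names are hypothetical.

```python
from urllib.parse import urlencode


def posts_url(base_url, page=1, per_page=100):
    """URL for one page of posts."""
    query = urlencode({"page": page, "per_page": per_page})
    return f"{base_url.rstrip('/')}/wp-json/wp/v2/posts?{query}"


def comments_url(base_url, post_id):
    """URL for the comments attached to a single post."""
    query = urlencode({"post": post_id})
    return f"{base_url.rstrip('/')}/wp-json/wp/v2/comments?{query}"


def summarize_posts(posts):
    """Return (id, link, title) for each post in a decoded JSON list."""
    return [(p["id"], p["link"], p["title"]["rendered"]) for p in posts]
```

You keep incrementing the page number until the API stops returning posts, then feed each post ID into comments_url() if you want to dig into the comments.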