I Analyzed the Top 1 Million Sites to Find the Top WordPress Themes, Plugins & More

When I started on this project, I wasn’t planning to compile a list of WordPress statistics. The initial goal was only to get a list of the most popular premium WordPress themes. The idea was to run a speed test of the top premium themes (something I will do as soon as I find a way to acquire them all).

Once I started collecting the theme stats though, I got a bit carried away. Since I was already going to crawl a million sites, I figured I should probably collect as much info as possible.

Methods
I started out by downloading the top 1 million sites from Alexa. I removed the rankings and added “http://” to the beginning of each domain. The result was a big file with one URL per line.
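For anyone who wants to reproduce that step, here is a minimal Python sketch. It assumes the Alexa download is a CSV with one `rank,domain` pair per row. The function name and the sample rows are made up for illustration and are not what I actually ran.

```python
import csv
import io

def alexa_rows_to_urls(lines):
    """Yield 'http://<domain>' for each 'rank,domain' row (assumed layout)."""
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        domain = row[1].strip()
        if domain:
            yield "http://" + domain

# Hypothetical sample standing in for the real 1M-row file.
sample = io.StringIO("1,google.com\n2,youtube.com\n")
urls = list(alexa_rows_to_urls(sample))
print("\n".join(urls))
```

To run it on the real list, pass an open file handle instead of the `StringIO` sample and write the results out one per line.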

Next, I set up several custom extractions in Screaming Frog. One for each piece of data I wanted to collect. These were all regex patterns looking specifically for WordPress sites.
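Screaming Frog handled the extractions for me, but the same idea can be sketched in Python. The two patterns below are assumptions built around the “/wp-content/plugins” and “/wp-content/themes/” strings, not the exact expressions from my crawl, and `extract_slugs` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed patterns: capture the directory name that follows the plugins/themes path.
PLUGIN_RE = re.compile(r"/wp-content/plugins/([A-Za-z0-9_.-]+)/")
THEME_RE = re.compile(r"/wp-content/themes/([A-Za-z0-9_.-]+)/")

def extract_slugs(html):
    """Return (most frequently referenced theme slug or None, sorted unique plugin slugs)."""
    plugins = sorted(set(PLUGIN_RE.findall(html)))
    themes = Counter(THEME_RE.findall(html))
    theme = themes.most_common(1)[0][0] if themes else None
    return theme, plugins

# Hypothetical page source for illustration.
sample_html = '''
<link rel="stylesheet" href="/wp-content/themes/twentysixteen/style.css">
<script src="/wp-content/plugins/contact-form-7/includes/js/scripts.js"></script>
<script src="/wp-content/plugins/jetpack/modules/wpgroho.js"></script>
'''
theme, plugins = extract_slugs(sample_html)
print(theme, plugins)
```

A path-based pattern like this only sees plugins that load assets from their own folder, so anything that leaves no such path in the page source goes uncounted.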

I then fed the top 1M sites into Screaming Frog, broken up into 20 files of 50,000 lines each. (Screaming Frog, please make a command-line utility.)
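Splitting the list is easy to script. This is a rough sketch, assuming a plain text file with one URL per line. The `urls_01.txt` naming scheme and the tiny demo file are placeholders.

```python
import tempfile
from itertools import islice
from pathlib import Path

def split_lines(src_path, out_dir, lines_per_file=50_000):
    """Write consecutive chunks of src_path into numbered files; return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(src_path, encoding="utf-8") as src:
        part = 1
        while True:
            chunk = list(islice(src, lines_per_file))
            if not chunk:
                break
            out_path = out_dir / f"urls_{part:02d}.txt"  # hypothetical naming scheme
            out_path.write_text("".join(chunk), encoding="utf-8")
            written.append(out_path)
            part += 1
    return written

# Tiny demo: 5 URLs split into chunks of 2 lines each.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "urls.txt"
    src.write_text("".join(f"http://site{i}.example\n" for i in range(5)), encoding="utf-8")
    parts = split_lines(src, Path(tmp) / "parts", lines_per_file=2)
    part_sizes = [len(p.read_text(encoding="utf-8").splitlines()) for p in parts]
print(part_sizes)
```

With the default of 50,000 lines per file, a 1,000,000-line input comes out as 20 files.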

(Note: I also experimented with my own simple crawler as well as a Python crawler called Scrapy, but I wasn’t able to get anywhere near the speed of SF.)

The end result was 20 spreadsheets filled with some sweet, sweet WordPress data.

Results

1) How Many of the Top 1 Million Sites Use WordPress?

Keep in mind, I could only collect data from sites that have WordPress-specific strings. Things like /wp-content/ or ...meta name="generator" content="WordPress.... A lot of security plugins remove this kind of information because it can make you an easy target for hackers.
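As a rough illustration of that kind of check, here is a Python sketch that looks for the generator tag first and falls back to the /wp-content/ path. The regular expressions and the `detect_wordpress` name are my assumptions for this example, not the exact extractions used in the crawl.

```python
import re

# Assumed patterns for the two WordPress-specific strings mentioned above.
WP_PATH_RE = re.compile(r"/wp-content/", re.IGNORECASE)
GENERATOR_RE = re.compile(
    r'<meta\s+name="generator"\s+content="WordPress\s*([0-9][0-9.]*)?',
    re.IGNORECASE,
)

def detect_wordpress(html):
    """Return (is_wordpress, version or None) based on simple string evidence."""
    match = GENERATOR_RE.search(html)
    if match:
        return True, match.group(1)  # group is None when no version is shown
    return bool(WP_PATH_RE.search(html)), None

print(detect_wordpress('<meta name="generator" content="WordPress 4.7.3" />'))
print(detect_wordpress('<img src="/wp-content/uploads/logo.png">'))
print(detect_wordpress("<html><body>Hello</body></html>"))
```

A site that strips both strings would come back as not WordPress here, which is the same blind spot described above.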

In the end, I was able to identify 192,681 sites out of 1,000,000 using WordPress.

WordPress Usage Statistics for Top 1M Sites

2) Which WordPress Versions Are Most Popular?

Of all the data I collected, this is definitely what surprised me most. First, I was shocked to see how many people haven’t updated WordPress. 26% of the sites showing a version have an old version of WordPress. I mean, seriously, people!! Update!!

The second surprising piece of info here is that only 40% of sites are hiding their WordPress version. When it comes to securing WordPress against hackers, hiding your WordPress version is low-hanging fruit. Hackers target sites using older versions of WordPress. If you don’t show your version info, you become a harder target to find, even if you are running an old version of WP.

Top 10 WordPress Versions for the Top 1M Sites

Top 25 WordPress Versions found

  • 62364 WordPress 4.7.3
  • 9473 WordPress 4.6.4
  • 7567 WordPress 4.7.2
  • 5472 WordPress 4.5.7
  • 3948 WordPress 4.4.8
  • 3106 WordPress 4.6.1
  • 2321 WordPress 4.3.9
  • 1762 WordPress 4.2.13
  • 1405 WordPress 4.5.3
  • 1144 WordPress 4.4.2
  • 1123 WordPress 4.1.16
  • 881 WordPress 4.3.1
  • 782 WordPress 4.5.2
  • 677 WordPress 4.0.16
  • 625 WordPress 3.9.17
  • 579 WordPress 4.2.2
  • 569 WordPress 4.7
  • 478 WordPress 4.6
  • 473 WordPress 3.5.1
  • 385 WordPress 4.0
  • 364 WordPress 3.8.19
  • 352 WordPress 4.4.1
  • 352 WordPress 4.7.1
  • 327 WordPress 4.1.1
  • 313 WordPress 4.5.4

Which WordPress Themes Are Most Popular?

I honestly expected to see more free themes in the top 10, but it turns out the most commonly used themes were premium.

Top 10 Themes Used by the Top 1M Sites

Top 25 WordPress Themes found

  • 4069 Newspaper
  • 2765 sahifa
  • 2628 Divi
  • 2387 Avada
  • 1403 enfold
  • 1400 hueman
  • 1229 genesis
  • 1202 twentytwelve
  • 962 Newsmag
  • 953 twentyfourteen
  • 808 jarida
  • 797 twentysixteen
  • 747 x
  • 745 h4
  • 738 colormag
  • 680 point
  • 664 twentyfifteen
  • 634 salient
  • 599 simplicity2
  • 591 voice
  • 587 optimizePressTheme
  • 584 betheme
  • 571 dt-the7
  • 556 mh-magazine-lite
  • 549 jupiter

Which WordPress Plugins Are Most Popular?

The top plugin on this list wasn’t too surprising. I use Contact Form 7 on most of my sites. I believe a lot of premium themes require or recommend it as well.

Top 10 WordPress Plugins Used by the Top 1M Sites

Top 25 WordPress Plugins found

  • 57651 contact-form-7
  • 31131 jetpack
  • 18762 Yoast SEO
  • 12379 wp-pagenavi
  • 10511 wordpress-popular-posts
  • 10404 woocommerce
  • 6823 yet-another-related-posts-plugin
  • 5873 wp-polls
  • 5829 q2w3-fixed-widget
  • 5770 table-of-contents-plus
  • 5721 disqus-comment-system
  • 5558 tablepress
  • 5014 add-to-any
  • 5014 LayerSlider
  • 4765 ie-sitemode
  • 4642 mailchimp-for-wp
  • 4608 captcha
  • 4426 sitepress-multilingual-cms
  • 4075 wp-postratings
  • 3919 cookie-notice
  • 3813 cookie-law-info
  • 3754 wp-rocket
  • 3499 simple-social-icons
  • 3456 wordpress-23-related-posts-plugin
  • 3317 wysija-newsletters

How Many WordPress Sites Are Using https (SSL)?

Only about 21% of WordPress sites I found used https. I plan to look at this statistic for all 1M sites when I get a chance. I’d be curious to see if the number is similar for the rest.

http vs https for WordPress sites in the Top 1M Sites

How Many of the Top 1 Million Sites Use WordPress.com?

I thought this one was interesting as well. I was able to identify 2,278 WordPress.com sites (as opposed to self-hosted WordPress.org) in the top million.

Thanks for Reading. Please Share If You Found This Useful
If you have a comment, question, addition, or a request, leave a comment.

Resources Mentioned
Screaming Frog SEO Spider – I really can’t stress enough how useful this tool is. If you’ve got a small site, you can use it for free to check links, study HTML elements relevant to SEO, create sitemaps, and a lot more.

Scrapy – Scrapy wasn’t really great at crawling links from a list, but it is really useful if you’re interested in crawling the web in the traditional sense.

Download all Data
Feel free to download and use this data however you want. In return, please credit AllThingsBlogging.com.
WordPress_top.csv.zip ~6.5MB
Format is tab-delimited:
url\tWP Version\tWooCommerce\tTheme\tPlugins (plugins are delimited with ~)
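If you want to tally plugins from the download yourself, something like the following Python sketch should work. It assumes five tab-separated columns in the order listed above, plugins joined with ~, and no header row. The sample rows and the `count_plugins` name are invented for the example.

```python
import csv
import io
from collections import Counter

def count_plugins(tsv_lines):
    """Count plugin slugs from the 5th tab-separated column (plugins joined with '~')."""
    counts = Counter()
    for row in csv.reader(tsv_lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 5:
            continue  # skip rows without a plugins column
        counts.update(p for p in row[4].split("~") if p)
    return counts

# Hypothetical rows in the assumed column order:
# url, WP Version, WooCommerce, Theme, Plugins
sample = io.StringIO(
    "http://a.example\tWordPress 4.7.3\t\tDivi\tcontact-form-7~jetpack\n"
    "http://b.example\t\t\tAvada\tcontact-form-7\n"
)
plugin_counts = count_plugins(sample)
print(plugin_counts.most_common(2))
```

If the file turns out to start with a header line, skip the first row before counting.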

All Plugins grouped with counts ~ 755KB


6 thoughts on “I Analyzed the Top 1 Million Sites to Find the Top WordPress Themes, Plugins & More”

  1. Rick Viscomi

    This is insightful, thanks for sharing. Do you have the raw data available anywhere for further mining?

    • Badi Jones (post author)

      Thanks! And yes. I do have the raw data. I was planning on making it available. I just need to get everything together first. It’s all in a database right now. And the real raw crawl data (text) is pretty messy. I’ll get it all together and put it here in the next day or so. I also plan on doing this again in a few months, so if you have any ideas for other pieces of info to collect, let me know.

      If you want the raw data I started with (there is some stuff in there I didn’t import), I can put that up.

  2. Ilya Grigorik

    Badi, great stuff! Curious, how did you detect the plugins? Similarly, do you have the logic you used to detect WP sites shared anywhere?

    • Badi Jones (post author)

      I should have mentioned, I’m sure I missed several plugins. I was looking for lines containing this string: “/wp-content/plugins”.

      One example of a plugin that this pattern missed was Yoast SEO. I had to use a separate pattern to capture that.

      Themes were similar. “/wp-content/themes/…”

    • Badi Jones (post author)

      I just added a link to the raw data at the end of the post. There are some sites that listed wp-total-cache, but I’m sure the pattern I used just didn’t catch that plugin.

      Next time I do this, I’ll make a point to try to add more unique patterns for specific plugins, like I did for Yoast.
