Archive

Posts Tagged ‘configuration’

BetterWPSecurity – a great WordPress plugin but proceed with caution

February 19th, 2013

I’ve recently installed the BetterWPSecurity WordPress plugin and found that, while it’s very useful and does increase the security of WordPress, it can also break your site.

Ah, Monday morning and the start of my three months’ paternity leave looking after my six-month-old son Zach. During his morning nap I logged into my blog to work on an article and noticed that my blog wasn’t loading articles correctly even though the home page worked just fine. Investigating further and looking at my site stats (I use both the Jetpack plugin and Google Analytics) clearly showed that something broke at the start of the weekend – I had nearly no traffic all weekend. Having just referred a colleague to my site for some information, and on my first day of paternity leave (i.e. less time on my hands, not more as some may think), this was definitely not ideal timing!

My first step was to check my logs for information, in this case the BetterWPSecurity log for changed files. This revealed that the .htaccess file in the root directory was changed late on Friday night at 11:35pm – and I knew that wasn’t me as I was tucked up in bed. My first thought was a hack, as the .htaccess file controls access to the site, but there was no redirect or site graffiti and the homepage still worked so that didn’t seem likely. I logged in via SSH to have a look at the .htaccess file but didn’t see anything obvious, although I’m no WordPress expert.


My priority was to get the blog working again so I tried restoring a copy of the changed file from the previous week’s backup (made via the BackWPUp plugin), only to find the backup wasn’t usable. Bad plugin! Luckily I’m a believer in ‘belt and braces’ and I knew my hosting company, EvoHosting, also took backups. I logged a call with them and within the hour they’d replied with the contents of the file from a week earlier. Sure enough the file had been changed, but looking at the syntax it appeared to be an error rather than a malicious hack.

My .htaccess file when the site was working:

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress

My .htaccess file after the suspicious change:

# BEGIN Better WP Security
Order allow,deny
Allow from all
Deny from 88.227.227.32
# END Better WP Security
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
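With hindsight, the damage in the second file can be spotted mechanically. The sketch below is only an illustration of the kind of sanity check I could have run over SSH; the function name and the three checks are my own choices, it is not part of WordPress or BetterWPSecurity, and it is no substitute for Apache’s own config parsing. It assumes the file contents exactly as listed above.

```python
import re

BEGIN, END = "# BEGIN ", "# END "


def check_htaccess(text):
    """Return a list of structural problems found in .htaccess text."""
    problems = []
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Every "# BEGIN <name>" marker should be paired with "# END <name>".
    begins = [ln[len(BEGIN):] for ln in lines if ln.startswith(BEGIN)]
    ends = [ln[len(END):] for ln in lines if ln.startswith(END)]
    for name in ends:
        if name not in begins:
            problems.append(f"'# END {name}' has no matching '# BEGIN {name}'")
    for name in begins:
        if name not in ends:
            problems.append(f"'# BEGIN {name}' has no matching '# END {name}'")

    # <IfModule ...> and </IfModule> must balance.
    depth = 0
    for ln in lines:
        if re.match(r"<IfModule\b", ln, re.I):
            depth += 1
        elif re.match(r"</IfModule>", ln, re.I):
            if depth == 0:
                problems.append("'</IfModule>' has no opening '<IfModule ...>'")
            else:
                depth -= 1
    if depth > 0:
        problems.append("'<IfModule ...>' is never closed")

    # Rewrite rules do nothing unless the rewrite engine is switched on.
    has_rules = any(ln.startswith("RewriteRule") for ln in lines)
    has_engine = any(re.match(r"RewriteEngine\s+On\b", ln, re.I) for ln in lines)
    if has_rules and not has_engine:
        problems.append("RewriteRule found but 'RewriteEngine On' is missing")

    return problems


# The mangled file from this post, copied as listed above.
MANGLED = r"""# BEGIN Better WP Security
Order allow,deny
Allow from all
Deny from 88.227.227.32
# END Better WP Security
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
"""

for problem in check_htaccess(MANGLED):
    print(problem)
```

Run against the working file it reports nothing; run against the mangled one it flags the missing ‘# BEGIN WordPress’ marker, the orphaned closing tag and the missing ‘RewriteEngine On’ line.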

I backed up the suspicious copy of the file (for future reference, i.e. writing this blogpost), restored the original, et voilà – the blog was working again. Step one complete, now to find the root cause…

Part of any diagnostic process is the question ‘what’s changed?’ and I had a suspicion that BetterWPSecurity could be the culprit as I’d only installed it a few weeks earlier. There was also the obvious issue of the new code in the .htaccess file, which looked to belong to BetterWPSecurity. I checked the site access logs, which confirmed my hypothesis – someone had attempted to break into my site and, while attempting to block the attacker, BetterWPSecurity had mangled my .htaccess file. The logs below have been truncated to remove many of the brute force login attempts (there were plenty more), but note two things: on the final line (after BetterWPSecurity has blocked the attacker) the HTTP status code was 418 (“I’m a teapot”) rather than 200, and the suspect IP 88.227.227.32 is the same as the one denied in the mangled .htaccess file. Yes, you read that right, “I’m a teapot”! It’s an April Fools’ status code defined in RFC 2324 (the Hyper Text Coffee Pot Control Protocol). :-)

88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" 200 3017 "http://www.vexperienced.co.uk//wp-login.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" 200 3017 "http://www.vexperienced.co.uk//wp-login.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" 200 3017 "http://www.vexperienced.co.uk//wp-login.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" 200 3017 "http://www.vexperienced.co.uk//wp-login.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" 418 5 "http://www.vexperienced.co.uk//wp-login.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
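If you’d rather not eyeball hundreds of these lines, a few lines of Python will summarise them. This is a rough sketch that assumes the log format shown above; the regular expression and the function name are my own, and the sample holds just two of the lines from this post.

```python
import re
from collections import Counter

# Matches the leading fields of the access log lines shown above:
# client IP, two ignored fields, timestamp, request line, status, size.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def summarise_logins(lines, path="/wp-login.php"):
    """Count POSTs to the login page, keyed by (client IP, status code)."""
    counts = Counter()
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m and m["method"] == "POST" and m["path"] == path:
            counts[(m["ip"], int(m["status"]))] += 1
    return counts


# Two lines taken from the log above: one before and one after the block.
SAMPLE = [
    '88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" '
    '200 3017 "http://www.vexperienced.co.uk//wp-login.php" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"',
    '88.227.227.32 - - [15/Feb/2013:23:35:19 +0000] "POST /wp-login.php HTTP/1.1" '
    '418 5 "http://www.vexperienced.co.uk//wp-login.php" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"',
]

for (ip, status), hits in sorted(summarise_logins(SAMPLE).items()):
    print(ip, status, hits)
```

A large count of 200 responses followed by a switch to 418 for the same IP is the pattern to look for: lots of login attempts, then the block kicking in.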

So BetterWPSecurity led me to the fault but also caused it. To be fair the plugin does warn you which settings are potentially going to cause issues but I’d assumed that it wouldn’t be me – dangerous things, assumptions. I’ve rectified the issue by restricting BetterWPSecurity from altering core system files, as shown in the screenshot below.

My blog is fixed and I’m feeling quite chuffed that it was all resolved during a long lunchbreak – not a bad day’s work if I do say so myself! Lesson for today? Take warnings seriously and have multiple backups!

Categories: VMware

Federated login failures – the LSA cache

September 5th, 2012

While working recently on an ADFS federation solution I came across a Microsoft ‘feature’ which doesn’t seem to be well known and which caused me to deliver my project a week late. It often manifests itself via failed logins and affects many products which integrate with AD such as SharePoint, Office 365, OWA, and of course ADFS. This is very much one of those ‘document it here for future reference’ posts but hopefully it’ll help spread the word and maybe save someone else the pain I felt!

To see how the ‘feature’ affects ADFS you need to understand the communication flow when a federation request is processed. The diagram below (from an MSDN article on using ADFS in Identity solutions) shows a user (the web browser) connecting to a service (the ASP.NET application, although it could be almost any app) which uses ADFS federation to determine access:

Communication flow using federated WebSSO

Summarising the steps:

  • The user browses to the web application (step 1)
  • The web app redirects the user to ADFS (steps 2 and 3)
  • ADFS attempts to authenticate the user, usually against Active Directory (step 4)
  • ADFS generates a token (representing the user’s authentication) which is passed back to the user, who then presents it to the app and is given access (steps 5, 6 and 7)

My problem was that while some users were being logged into the web application OK, some were failing and I couldn’t work out why. Diagnosing issues in federation can be tricky as by its nature it often involves multiple parties/companies. The web application company were saying their application worked fine, both redirecting users and processing the returned tokens. The users were entering their credentials and being authenticated against our internal Active Directory. ADFS logs showed that tokens were being generated and sent to the web app. Hmm.

Digging deeper I found that the AD username (the UPN to be precise) being passed into the token generation process within ADFS was occasionally incorrect. The user would type their username into the web form (and be authenticated) but when ADFS tried to generate claims for this user via an LDAP lookup it used an incorrect UPN and hence failed. It seemed as if the Windows authentication process was returning incorrect values to ADFS. This stumped me for a while – how can something as simple and mature as AD authentication go wrong?

Of course it’s not going wrong, it’s working as designed. It transpires there’s an LSA cache on domain member servers which, by default, holds entries for 7 days. When AD values have changed recently, the cache can cause the original, rather than the updated, values to be returned to the calling application by the AD authentication process. A simple change such as someone getting married and having their AD account updated with their married name could therefore break any dependent applications. Details of this cache can be found in MS KB article 946358, along with the priceless statement “This behaviour may prevent the application from working correctly”. No kidding! This impacted my project more than most because the AD accounts are created programmatically via a web portal and updated later by some scripts. The high rate of change means they’re more susceptible to having old values cached.
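To make the effect concrete, here’s a toy model of a lookup cache with a fixed expiry. It is deliberately simplified and entirely hypothetical: the class, the account ID and the example UPNs are made up for illustration, and the real LSA cache has more settings than a single expiry period (see the KB article). The point is only to show how a rename can go unnoticed until the cached entry ages out.

```python
class LookupCache:
    """Toy name cache with a fixed expiry (illustrative only, not the LSA API)."""

    def __init__(self, directory, expire_days=7):
        self.directory = directory      # dict standing in for AD: account id -> UPN
        self.expire_days = expire_days  # 0 effectively disables caching
        self._entries = {}              # account id -> (upn, day it was cached)

    def lookup(self, account_id, today):
        cached = self._entries.get(account_id)
        if cached is not None and today - cached[1] < self.expire_days:
            return cached[0]            # served from cache, possibly stale
        upn = self.directory[account_id]
        self._entries[account_id] = (upn, today)
        return upn


# Hypothetical account and names, purely for illustration.
directory = {"user-1001": "jane.smith@example.com"}
cache = LookupCache(directory)

print(cache.lookup("user-1001", today=0))   # jane.smith@example.com (now cached)

# The account is renamed in the directory the next day...
directory["user-1001"] = "jane.jones@example.com"

print(cache.lookup("user-1001", today=1))   # jane.smith@example.com (stale)
print(cache.lookup("user-1001", today=7))   # jane.jones@example.com (entry expired)
```

With expire_days set to 0 every lookup goes back to the directory, which is the toy equivalent of disabling the cache.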

This might seem like a niche problem but it also impacts implementations of SharePoint, OWA, Project Server, and Office 365 – any product that relies on AD for authentication. These products can be integrated with AD to facilitate single sign-on but if you make frequent changes to AD the issues above can occur.

How can I diagnose this issue?

The symptoms will vary between products but thankfully Microsoft have some great documentation on ADFS. The troubleshooting guide details how to enable the advanced ADFS logs via Event Viewer; once you’ve got those, check for Event ID 139. The event details show the actual contents of the authentication token so you can check the UPN and ensure it’s what you expect. If it isn’t, follow the instructions in the KB article to disable or fine-tune the cache retention period on the domain member server (i.e. the ADFS server, not the AD server).

Further Reading

Understanding the LSA lookup cache

Using vCenter Operations v5 – Capacity features and conclusions (3/3)

April 16th, 2012

In the first part of this series I introduced vCOps and its requirements before covering the new features in part two. This final blogpost covers the capacity features (available in the Advanced and higher editions) along with pricing information and my conclusions.

The previous trial I used didn’t include the capacity planning elements so I was keen to try this out. I’d used CapacityIQ previously (although only briefly) and found it useful but combined with the powerful analytics in vCOps it promises to be an even more compelling solution. VMware have created four videos with Ben Scheerer from the vCOps product team – they’re focused on capacity but if you’ve watched Kit Colbert’s overview much of it will be familiar.

UPDATE APRIL 2012 – VMware have just launched 2.5 hrs of free training for vCOps!

If you don’t have time to watch the videos and read the documentation (section 4 in the Advanced Getting Started guide) here are the key takeaways:

  • Capacity information is integrated throughout the product, although modelling is primarily found under the ‘Planning’ view. Almost every view includes some capacity information, either via the dynamic thresholds (which indicate the standard capacity used) or popup graphs of usage and trending.
  • Storage is now included in the capacity calculations (an improvement over CapacityIQ), resulting in a more complete analysis. Datastores are now shown in the Operations view, although if you’re like me and use NFS direct to the guest OS it’s not going to be as comprehensive as using block protocols.
  • The capacity tools require more tailoring to your environment than the performance aspects but provide valuable information.
  • With vCOps you can view both existing and predicted capacity, and you can model changes like adding hosts or VMs.

Read more…

Using vCenter Operations v5 – What’s new (2/3)

April 12th, 2012

In part one of Using vCenter Operations I covered what the product does along with the different versions available and deployment considerations. In this post I’ll delve into what’s new and improved and in the final part I’ll cover capacity features, product pricing, and my overall conclusions. I had intended to cover the configuration management and application dependency features too but it’s such a big product I’ll have to write another blogpost or I’ll never finish!

Introductory learning materials

UPDATE APRIL 2012 – VMware have just launched 2.5 hrs of free training for vCOps.

Deep dive learning materials;

What’s new and improved in vCOps

Monitoring is a core feature and for some people the only one they’re concerned about. As your infrastructure grows in size and complexity, so does the need for a tool that combines compute, network, and storage in real time. Here are my key takeaways:

  • There’s a new dashboard screen which shows health (immediate issues), risks (upcoming issues) and efficiency (opportunity for improvements) in a single screen. The dashboard can provide a high level view of your infrastructure and works nicely on a plasma screen as your ‘traffic light’ view of the virtual world (and physical if you go with Enterprise+). The dashboard can also be targeted at the datacenter, cluster, host or VM level, which I found very useful, although you can only customise the dashboard in Enterprise versions. There is still the Operations view (the main view in vCOps v1) which now also includes datastores. This view scales extremely well – even if you have thousands of VMs and datastores across multiple vCenters they can all be displayed on a single screen.
    NOTE: If you find some or all of your datastores show up as grey with no data (as mine did) there is a hotfix available via VMware support.
Read more…

Using vCenter Operations v5 – Introduction and deployment (1/3)

April 10th, 2012

At VMworld 2011 in Copenhagen VMware unveiled a significant revamp of their management suites, including a new version of vCenter Operations Manager (v5 to align with the vSphere release). vCenter Operations is now a suite of tools which includes vCenter Configuration Manager, the new vCenter Infrastructure Navigator (which I’ll cover in a later blogpost) and vCenter CapacityIQ (which is now fully integrated into vCOps; the standalone CapacityIQ is now end of life).

Although announced at VMworld it wasn’t publicly available until Jan 2012 when VMware formally launched vCOps v5. Coming less than a year after the release of the first version it’s apparent that VMware see this as an important product which is evolving fast. Steve Herrod, VMware’s CTO, stated recently at the Italian VMUG (around the 5 minute mark) that vCOps ‘is becoming the most adopted new technology that VMware has ever had’. The vCenter Operations suite is still aimed at infrastructure monitoring as opposed to application monitoring (despite the addition of Infrastructure Navigator) – VMware’s solutions aimed at the application tier belong to the vFabric suite. For a good overview of where vCOps and vFabric Hyperic fit into VMware’s cloud suite read Dave Hill’s blogpost on the subject.

If you aren’t familiar with vCenter Operations here are the kinds of questions it aims to answer:

  • Is your virtual infrastructure healthy?
  • What serious problems should I address immediately?
  • Is the workload in my environment normal?
  • Am I using the resources in my environment efficiently?
  • How long do I have before resources run out?
  • What impact did a recent change have?

A few people have already posted articles which I’d recommend reading.

With v1.0 I concluded that it was a great product but there were a few reasons why it wasn’t for me, primarily the lack of email notifications and pricing. In this post I’ll cover the requirements and deployment considerations for the new version and in part two I’ll cover day to day use and new features. The final part will cover the capacity features along with info about pricing and my conclusions.

UPDATE APRIL 2012 – VMware have just launched 2.5 hrs of free training for vCOps.

Read more…