online datig

datingsite in usa

guy personals

real swinger videos

wwww sexofree org

swinger utah

wadsworth singles

best dating online services

industrial single board computer

sex book online

rocklin singles

professionals speed dating

adultery com

singles events ideas

filipino freind finder

best internet dating sites

best adult

dating sites new

dating simulator online

sexkontakt

dating service australia

wife swappers

single women looking for sex

adult black

100 swinger

livingston singles

single parent dating website

jewish singles orange county

mature passions

columbian dating

adultfinder com

escort agencies newcastle

swinging lifestyles

little rock singles

match maker au

cheap uk sex lines

no strings sex contacts

bay area redbook

cassidy singles

divorced women seeking men

adventure vacations for singles

tg personals

dating service alternative

com adult

sex search engine

www beast dating com

affairs video

international free dating site

christian singles mingle

free african american dating sites

singles bristol

sm date

dating australian women

singles saudi

devon singles

ebony sex web sites

dating fat

chating dating

philadelphia personals

so cal swing clubs

singles circle

amtch com

london dating service

www galaxysingles

live web cameras free

flirt date

man gay sex

singles cape town

audlt com

singles friends

sexcam chat

arabic sex web site

sngles

horney matches

secret friends sex chat

indian swinging

www personals yahoo com

100 free online dating websites

adultmatch

sex talk chat

top uk dating

www chat room com

sex chat nz

rancho cucamonga singles

dating software

singles events belfast

18 singles

femdom vid

sexfinder

gay date com

qld singles

singles chatline

personal massagers

catholic singles seattle

dating and chating sites

chat with live girls

sex site search

women seeking affair

websites adult

100 freedatingsite

Achieving Good Performance Is More Than Tuning Queries and Indexes

You may have heard the phrase “Pay no attention to the man behind the curtain” which is a line from the movie The Wizard Of Oz. This is the message many infrastructure teams like to sell to DBAs – “Pay no attention to the hardware behind the curtain”. To a certain extent, this is what we like to hear true since the DBAs have enough to focus on in their daily tasks. Blind faith can lead to problems, since the performance of SQL Server can be negatively impacted by poorly selected and configured components behind the scenes. To combat this, a DBA should know how each component works together to form the configuration of the infrastructure behind SQL Server, as well as the details and little minutiae of the server configuration.

Think of your environment and see if you could answer the following questions (many of which I will ask as part of an assessment or onsite engagement):

  • Are the servers virtualized?  If they are, are any resources over committed?
  • How many SQL Servers are sharing the same physical chassis?
  • How many network hops does it take to get from the SQL Server to the application servers?
  • What uplink speeds are on the switches between your SQL Server and the application servers?
  • How much cache is in the SAN?
  • What tier storage do your data and log files reside on?

If you can answer some or all, you’re ahead of most DBAs out there. Unfortunately, in my experience working directly with customers over the years, the DBAs are in the position of not knowing any of this stuff. But where do you start to find these answers, let alone understand the infrastructure concepts behind them?

Develop a relationship with each team that supports the infrastructure underneath SQL Server, such as those responsible for storage.  Have each team provide you a high level view of your environment and ask questions. Over time, your knowledge will grow and the design decisions you make based on that knowledge will start to have a positive effect in your environment.

These relationships are important both short and long term. Short term, it will come in handy when an issue occurs and you need to reach out to the other teams for help while troubleshooting.  Long term, they can be leveraged when you are designing the future state architecture for a new mission critical application. By then, there will be a healthy working relationship between the teams, yielding a much better design.

Here is an example of why understanding the plumbing behind your SQL Server matters and how it can affect something as simple as your daily backups.  Many of the physical servers with SQL Server installed are often rack mounted. There can be 6 – 8 servers in one chassis. Those servers share common components like network and power. Why does that matter?

Now think about your backup strategy. It’s in our nature to typically schedule all of the full database backups for a single instance at the same time each night. A typical database backup is limited by several factors, but for this example, assume an average of 75MB/sec can be written to the destination, which in this case is a network-based backup solution. If there are 8 servers on a single chassis and you are backing up across the network at the same time for each physical server (all have SQL Server installed), the chassis and network segments need to be able to handle a total of 75 * 8, or 600MB/sec.

A 1 GB network yields a maximum of 122MB/sec ((1,000,000 Gbps/8 bits)/1024 = 122MB/sec), so you do not have enough throughput to get the job done. Achieving 600 MB/sec means you need a 10 GB network.to achieve the performance you require to support the all of the backup processes running concurrently. These bandwidth requirements do not include any other network needs for the applications or other traffic on the network at the same time, so you may also need even more network throughput. I bet you never thought of your backup performance from that angle!

Since many of you now run some or all of your SQL Server deployments in virtual machines, a similar concept happens. Your virtualization admins could deploy a ton of VMs on the same hardware, creating a situation where one or more of the SQL Server VMs you are responsible for is competing with any number of other VMs for memory, network, and I/O since they share the resources of the hypervisor. This means that you will have to not only understand what is going on from a performance standpoint in the VM, but also see what is happening at the hypervisor level, too. That includes knowing how many VMs are running.

I hope this post inspires you to now want to look behind the curtain and see what is back there. Ask the questions from above as well as others you come up with above. The answers will allow you to be better informed and make smarter decisions in every aspect of your role as a DBA.

Will SQL Server 2012′s Availability Groups Replace Clustered Instances?

Happy Monday, everyone!

SQL Server 2012 is right around the corner, and it’s a release that is right up my alley in the mission critical space with features like availability groups (AGs) as well as Windows Server Core support. I’m looking forward to its RTM and subsequent release into the world.

However, I keep getting one question over and over from various people so I thought I’d address it to a wider audience. The question is a simple one: Will AGs replace or do away with the need for a clustered instance of SQL Server? If you want the short answer, it’s a resounding NO, but read on if you want to know why I’m saying that. For the record, if you were in doubt, both are branded under the designation marketing gave to the mission critical features of SQL Server 2012: AlwaysOn (with no space). Make no mistake about it: they are two different features with a single marketing designation.

First and foremost, let me say this: because this is going to be a quick, down-and-dirty post, there will be no pictures. I’m breaking one of my rules here, because this topic could use one (or more). Hopefully without the pictures all of this will still make sense.

Availability groups and clustered instances of SQL Server (also known as FCIs) have one thing in common: Windows failover clustering. AGs use Windows failover clustering (WSFC) as an underlying platform for things like the listener and to provide quorum (like the optional Witness configured on an instance did for database mirroring [DBM]). It makes sense – use a standard, well-proven process like WSFC and not reinvent the wheel. The cool thing about availability groups is that you do not have to cluster the instances; you can use standalone instances on servers that are clustered (those servers are also known as nodes).

[Edie and side note - This is the one aspect that seems to confuse folks with AGs: requiring WSFC but not needing an FCI. I hope you now understand why.]

Here’s the main difference: FCIs protect an entire instance of SQL Server, while an AG – just like log shipping or DBM – only protects at the database level. AGs are much better than DBM in terms of what they offer (example: readable replicas), but you still need to worry about things like logins at the instance level, SQL Server Agent jobs, or any other object that is not in the database. With SQL Server 2012 you could use contained databases to solve the login issue, but that won’t be an option for everyone. You need to think about both levels of protection, and there are a ton of other factors which would influence how a final solution is built. I do not want you to think this is going to be a black and white decision.

The bottom line: it depends on what your goals are for availability and/or disaster recovery. One nice thing about FCIs and AGs is that they play well together, unlike FCIs and DBM. That latter combination, depending on what you wanted to do, could work well together but it was never 100% ideal. There are certainly some caveats to the FCI/AG combination, but that’s a whole separate topic I’m not going to even touch here. I don’t want to give you the wrong idea – combining isn’t bad nor impossible, but like anything else, you do need to plan properly to have them work optimally.

I hope this clarifies things. If not, let me know in the comments and I’ll address any questions there or in another blog post.

Stop Automatically Updating Production Servers

 

Everything old is new again – well, not really. Many of the core concepts I teach or talk about be it in a session or onsite as part of knowledge transfer on a consulting engagement haven’t changed over the years. Implementations may change, but ultimately, the things that drive them do not. If you care about such arcane concepts like uptime, availability, and mission critical, you understand that downtime is bad. Very bad. Yet I still see and talk to quite a few customers who like to cause themselves unnecessary pain. I mean, it’s perfectly legal here in the USA to own a gun and shoot yourself in the foot, but why would you? It makes no sense.

Move Forward, Not Backward

A big part of staying mission critical is patching your servers – be it an application, the OS, hardware drivers, BIOS … you get the idea. You want to stay supported and up to date. The farther behind you are, the tougher it is to catch up or make the next major upgrade (say, SQL Server 2008 to SQL Server 2012) because you are not at the right minimum level. But that’s not the focus here – the focus is on those more regular patches that get installed such as security updates, browser updates, and so on. Those are the thorns in the proverbial rose bush.

Most companies have outage windows to apply updates and perform other maintenance to servers regardless of OS and other choices. That makes sense and is definitely needed – 100% uptime is largely a fallacy. (Sorry to burst your bubble if you really believed it was possible.) But whether it’s a security patch or a “routine”, more infrequent service pack, care must be taken.

Summertime … And The Updates Are Easy

George Gershwin is probably rolling in his grave, but I parody with the deepest respect. Microsoft has made it painfully easy (emphasis on painful) to update your servers. If you’re connected to the interwebz, you can set the handy, dandy Windows Update feature to download updates automatically. Even better – you can even have Windows automatically install them for you, too. What’s next? A tiled interface that makes Windows harder to use and makes your Server look like Windows Phone? Nah, something that silly could never never happen in a million years. Oh wait …

All kidding and sarcasm aside, look at your production servers: are they set to automatically update? If so, you are NOT mission critical. You may think you are, but you’re not. You’ll never convince me otherwise. Why?

  1. Mission critical is a mindset. Part of that mindset is caring about things like SLAs, RTOs, and RPOs. Allowing servers to spontaneously download and update themselves – possibly requiring a reboot or worse causing one without you even asking – is 180 degrees from that mindset.  You might as well just remove any redundancy and leave your chances to fate, who is a cruel mistress. You NEVER want to see this should you have to log into a production server:
    Restart prompt
  2. Updating for the sake of updating is never the right approach. File under “just because you can, doesn’t mean you should”. Key tenets of mission critical are predictability, consistency, and stability. Servers, especially those participating in a clustered configuration (Windows and/or SQL Server), should be evaluated before applying an update to see if they actually need them. Some do NOT apply, and others may be specific to a configuration (such as a cluster). All patches should be tested prior to going on a production box, and some should be given more weight than others. For example, a fix that touches, say, TCP/IP would be one I would test the heck out of since servers don’t rely on networking or anything. :)
  3. To add to #2, I’m going to stick on the consistency point for a moment: servers, especially ones considered mission critical, should all be configured the same. You don’t want varying patch levels and such. That spells trouble with a capital T. (Thank you, Meredith Wilson … I’m on a musical roll tonight!)
  4. If you are going to do anything automatic, just download the updates but for heaven’s sake, don’t let the server automatically apply them. That would be a huge step forward.

The Bottom Line

Look, I realize many of you struggle on a daily basis with your servers and doing this one thing – automatic updates – seemingly makes your lives easier by taking a big task off your proverbial plate. In reality, it may be making it harder. And I also know that in smaller shops especially, you don’t always have the time or (wo)manpower to evaluate each patch/update. I get it. There’s a happy medium somewhere. Don’t turn short term gain into long term pain. There’s no easy or right answer here per se. You have to update. Do what’s best for you, but realize you may blow anything relating to uptime and reliability in the process.

Upgrades and Slipstreamed Installations

I’m in the process of testing upgrades to SQL Server 2012. In the process, I’m trying different paths. The current one I’m in the middle of is a clustered instance which will go from SQL Server 2005 SP4 –> SQL Server 2008 SP1 –> SQL Server 2008 R2 SP1 CU3 –> SQL Server 2012. The SQL Server 2008 and 2008 R2 installations were slipstreamed.

However, when I went to go upgrade to SQL Server 2008 R2 SP1 CU3 using my slipstreamed media, the upgrade kept failing. When I went to go look at the log files, one log file (SqlBrowser_Cpu32_1) had the culprit:

=== Verbose logging started: 2/13/2012  22:33:26  Build type: SHIP UNICODE 5.00.7601.00  Calling process: C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\SQLServer2008R2\x64\setup100.exe ===
MSI (c) (A8:4C) [22:33:26:965]: Resetting cached policy values
MSI (c) (A8:4C) [22:33:26:965]: Machine policy value ‘Debug’ is 0
MSI (c) (A8:4C) [22:33:26:965]: ******* RunEngine:
******* Product: C:\Slipstream 2K8 R2 SP1 CU3\1033_ENU_LP\x64\setup\sqlbrowser.msi

MSI (s) (34:F0) [22:33:27:012]: MainEngineThread is returning 1639
Info 1639. Invalid command line argument. Consult the Windows Installer SDK for detailed command line help.
;”C:\SLIPSTREAM

It looks like the code, for whatever reason, barfed with the spaces only at this point but was OK with it earlier. I renamed the directory to get rid of the spaces and it worked. I entered a bug on Connect here, so go and vote for it.

Weird Slipstream Observation with SQL Server 2008 R2, SP1, and a CU

Hello all. I was putting together some demos for the private class I am delivering this week and I decided to slipstream SQL Server 2008 R2 with SQL Server 2008 R2 SP1 and SQL Server 2008 R2 SP1 CU3. This is fully supported and the process is straightforward. My installation worked fine, and my clustered instance got installed seemingly with no issues. However, I observed the following:

  • In the detailed log file (detail.txt), it references version 10.50.2789 – which is what I expect.
  • SQL Server thinks it’s at version 10.50.2789 - again, all good.
  • Summary.txt does not match – it has version 10.51.2789

See the pic below:

Completed SQL Server 2008 R2 + SP1 + SP1 CU3 install

Completed SQL Server 2008 R2 + SP1 + SP1 CU3 install

Has anyone else seen this? I’m not really woried since the engine is reporting the right number and detail.txt is fine. It just doesn’t make sense that one place would have a slightly different version number.

Also, PCUSOURCE and CUSOURCE didn’t show up on the Ready to Install dialog like I’ve seen. Again, no big deal but just found it interesting.

Just Because You Don’t Like The Answer Doesn’t Mean It’s Wrong

Many people, including myself, spend our own time answering questions on things like the #sqlhelp hash tag on Twitter and on various forums (TechNet/MSDN, SQL Server Central, etc.). This even happens during talks in Q&A sessions. Most people are nice and gracious, but every now and then, you have someone who just keeps chipping away at the stone to hopefully get the answer they really want to hear (even though it’s not possible). 10 observations:

  1. One thing I see quite often is the lashback when a correct answer is provided but yet the arguments still continue. Here’s the deal: some technologies – clusters are a good example – work the way they work. I didn’t design the feature. For example, the requirement for shared storage for a clustered instance of SQL Server up through SQL Server 2008 R2 (this changes a little in 2012, but only in certain scenarios). You can’t get a Windows failover cluster or a SQL Server clustered instance to work any other way potentially without either unsupported (by MS) hacks or not clustering at all. Don’t hate us if you may have not thought out all aspects of your design, have run into an issue, and don’t like what we have to say. We’re really not trying to egg you on or irk you – we’re just giving facts.
  2. Don’t expect us to design your massively complex solution for you for free based on 140 characters or a blog post in a free forum. That’s consulting and what most of us do for our day jobs. We’ve got bills to pay. We can point you in the right direction, but hopefully you’re going to base your design on more than a few line response. Those of us who answer questions love to give back, but realize that there’s always going to be some caveats to what you get for free in a smaller response.
  3. We can’t make you experts in 140 characters or a series of replies. Hopefully you’re not expecting that.
  4. Help us help you. This can’t be said enough. The more and accurate info we have, the better. I go through this with my clients all the time – it’s hard to make decisions about tomorrow when you don’t know what is going on today, what happened last week, or even last year. You need historical data to be able to make informed decisions.
  5. You’re probably wasting your time if you ask something like “I need more disk performancebut I’ve got no budget. Will 5 SATA disks work?” You pretty much have already answered your own question. You can do things right, you can do things fast, or you can do things cheap. Those three things generally don’t all work together. Sometimes you can do them right and fast, but never cheap. Fast and cheap are often lumped together, but often leads to never being right. I think you get the point. There’s a tradeoff somewhere.
  6. Capacity is not just size (memory, storage, etc.). There’s a whole performance aspect you need to consider. That’s why we ask questions like “How many I/Ops do you need?”
  7. Your workload matters. Know what it is and be able to hopefully articulate that. What advice works for one implementation may be totally wrong for another.
  8. RT(F)M. Before asking us a question, do some simple homework first. A simple BinGoogle search may yield your answer or point you to a BOL topic to allow you to ask a more directed question. That’s where we’ll provide you with (hopefully) some value.
  9. Do NOT post any confidential info about your company, app, or data. This may include things like schema, object names, etc. Common sense, but is it really? Find a way to express your question while still keeping sensitive things under wraps. Easier said than done, of course. This is why I put NDAs in place with my customers when we work together. It allows me to do work and protects both them and me.
  10. I subscribe to the motto that there are no stupid questions, and I truly believe that. Heck, we all started somewhere. It’s one reason I give back to the community. I didn’t get here on sheer will alone - I’ve had help along the way. But really look at #8 above – many questions can be answered that way. What is not easy, though, is trusting the information source it may come from. I’ve seen some sketchy advice put out there (no, I won’t say where), and it’s a free Internet. People can post what they want based on their experiences – just doesn’t make it technically sound advice. OK, I lied – one example which involves shrinking. We all know how shrinking is not a normal best practice. Don’t believe me? Just read some of Paul Randal’s thoughts. Point is, that first linked post gave OK technical instructions, but not necessarily the most sound advice. Learn to differentiate the two. In the case of that original link, I bet the DBA smartly sized the DB for longer term growth. We’ll never know.

Sorry if this sounds cranky, but I see this stuff over and over. We really do want to help you. Just don’t shoot the messenger if you don’t like the message.

Always Move Forward

Can you believe it’s February already? 2012 is flying by! I’m thankful that I’m crazy busy, and the partnership with Ben is going well (knock on wood). I’m really looking forward to the training class in March and psyched to have another Training Day at SQLBits X (have you registered yet? No? Go do it!). It’s been a fun travel year already. With one trip to Seattle and California under my belt already, I’ve got quite a few trips between now and April (including across the Atlantic to SQLBits in March) with a few more pending.At this rate, I’ll hit 50,000 miles in the air by mid-year. I know, first world problems – cry your crocodile tears for me.

Quite frankly, since late August when my friend Mike passed away (see Life Is Fragile and Fitting Tribute to a Hero for more info), things have been a blur. Last week (January 25 to be exact) would have been his 40th birthday. No wonder it’s been a somewhat “meh” couple of weeks for me. Considering I just had my 40th at the end of November, it’s all still very visceral for me. Little things remind me of him and probably always will. Rest in peace, my friend.

I also blogged a little about my substandard PASS scores. And no, they were not in the 2s, but certainly poor. To be perfectly honest, since I briefly looked at them back in December, I can’t bring myself to really look at them again. I remember some of the comments. Some were spot on. Looking back at the possible reasons for failure (at least in my eyes), like I said in that blog post – any problems fall squarely on my shoulders. I can’t blame anyone else, but I could certainly use some excuses such as:

  • I used a build that was not very old and I had just gotten clearance a few days before to use – and right before the holidays so I didn’t have much time to work with it. I knew the risk, and I bit the dirt on that one.
  • In retrospect, the death of my friend really impacted me more than I could have ever imagined. Even with killer material and solid demos, I probably still would have had a sub-par PASS by any stretch of the imagination. Again, not an excuse, but looking back I can see where that really weighed on me.
  • I was a bit far ahead of the curve. Since it’s not a stretch to say that SQL Server 2012 will RTM soon now that we’re at the Release Candidate and full adoption for many customers is months, if not a year or more away, doing all SQL Server 2012 material that early was a big risk.

So I went big, took risks, and failed. That’s OK – it’s a good learning exercise and I’d probably do it again. Why? Well, I didn’t get where I am today without taking some big leaps of faith in myself (and sometimes those around me) along with some risks along the way. Sometimes you win, sometimes you lose. You can learn from both. If you never experience failure or disappointment (real disappointment, not the “Mommy didn’t buy me the stuffed animal at the toy store” kind), I’m not sure you can ever understand what success looks like. At least that’s how I feel on the matter.

Despite the fact I have done this for years, my PASS scores showed I may need to brush up on presenting, so I’m actually looking to improve my presentation skills. I’ve done a few things already (including getting a few books) that I hope will show some positive results.  Last week at the SQL Bare Metal training in Redmond, I got up and did an impromptu 20 minute presentation that felt really good. No deck, just demos; and no preparation. That was a tough audience – all MCMs, MVPs, and other really smart folks who know SQL well. Easy to bomb there. I didn’t. Maybe it is working. You can let me know when you see me present at SQL Saturday in Mountain View or at SQLBits, and even the training class in March.

To be honest, even if I never score in the top 5 or 10 at a conference, I get more satisfaction from people that come up to me – be it 1 day or 2 years later – saying they took something from a talk I did and it helped them. Those are measurable results numbers won’t ever show.

The journey matters, not just the destination. A score is just a number. Having a positive impact on someone is much better than a score in the long term.

 

The Double Secret Handshake

It’s common to hear on the news “A source close to <insert government body or celebrity here> has informed us that …” – basically they’re leaking information they probably shouldn’t and may even legally be bound not to talk about. Sometimes there’s no fallout, other times it’s pretty bad. Well, the same thing happens in our world of computers. One example is a major event like CES (which happens to be this week, actually) – you see some leaks of products before any official announcements. We all love the super-secret spy shots of some as-yet-to-be-announced gadget. We then speculate what the buttons do. That’s no better or worse, but a lot of fun. Sites like Engadget are often about that type of speculation. Companies try to keep things under wraps as to keep some sort of competitive advantage – seems like that’s almost impossible.

Recently there was a discussion some of us were having around things like build numbers for SQL Server and release dates. Let’s face it, if you enter a bug in Connect for SQL Server, you see a bunch of releases both old and new in the dropdown to match up with the version that you had the problem with. Some of those builds or versions you may have never seen or heard of, so the fact there are interim builds you don’t have access to is one of the worst open secrets out there. For example, in the case of SQL Server 2012, a build number may be later than CTP3 but before RC0. That build is most likely associated with a one-off for a specific, small set of customers. An example of that is, say, an early adopter program (commonly known as TAP). These builds are not for general consumption and you hopefully won’t find people blogging or talking about them. Why? They’re under pretty strict NDA agreements not to talk about this stuff in any kind of public way, and there are consequences for spilling any beans.

TAP is not the same as MVP and covered by different NDAs as well as other rules. As someone who is in the MVP program and has been in various TAPs over the years, you have to be careful about who you speak to about what. Some folks are in that intersection where you are in both programs, some just in the TAP, and some MVP. Where’s a venn diagram when you need one? You can’t assume that what one knows the other does as well, or that you can share just because it’s Microsoft and maybe even the same product. There’s something to be said for personal accountability. You shouldn’t even need the threat of possible legal actions (covered by the NDA) to keep what you shouldn’t be talking about quiet. Common sense has to prevail.

While it’s true there are folks outside of the proverbial four Microsoft walls that may know what’s coming (be it 1 week or 1 year from now), no one who is privy to such information should be leaking it. Getting into the MVP or a TAP is a privilege, not a right. It can be ripped out from you. So if you know people who are participating, you shouldn’t be getting any information from them that they’re bound not to talk about. It’s not meant to slight anyone. I’m sure you can think of times in your own lives – be it personal or professional – you didn’t want certain people to know things. For example, if you’re looking for a job, do you really want your current employer to know? Probably not, although I do know some people who would be open about that. The majority of us keep that on the down low until your two week notice is given.

If for some reason you do somehow get access to such data or information or someone opens their big mouth, think before you go mentioning it to the world at large. There are ripple effects. In the immortal words of Sgt. Schultz from Hogan’s Heroes …

 

Once More With Feeling: Stop Using Active/Passive and Active/Active

Happy 2012, everyone! I figured I might as well kick of the new year by beating a dead horse (not literally – don’t call PETA; I’m not an animal abuser). If you’ve ever heard me speak, read my books or whitepapers, seen a tweet, a blog post, or heck, know of me, you know there’s nothing more that gets my dander up than the use of active/passive (A/P) and active/active (A/A) as it relates to SQL Server failover clusters. This will not be my first time ranting, nor will it be my last. I ranted in this forum post over at SQL Server Central last May as part of a response to something I felt added to the confusion of why people still cling to this silly terminology today. Expect yet another diatribe in my upcoming SQL Server 2012 book, too (although if I like what I write here, I’ll just link this blog post). Anyway …

You may be asking yourself, “Why does he care?” The answer is not as simple as just using wrong terminology.

A Bit of History

Let’s take a way back machine for a moment. July of this year will mark 10 years since the publication of the SQL Server 2000 failover clustering whitepaper I wrote while at Microsoft. That paper has a lot to do with my career today, and I am forever indebted to MS for publishing it. That paper also features the first use of non-A/P and A/A terminology. The terminology is really simple and SQL Server specific. That was a main driver in doing it. You’ll see why as you read on.

SQL Server 2000 marked the biggest change in concepts with failover clustering with the introduction of instances. I think we take it for granted now, but 10+ years ago, it was a pretty big deal to have more than one SQL on a single box or a cluster. There were very practical issues which made having multiple instances of SQL Server 2000 difficult, namely how scalable both Windows and its underlying hardware were. Single core processors, poor hyperthreading support, not much memory (8GB was a luxury for most), needing AWE to go beyond 4GB of memory, 32-bit (discounting the very limited Itanium variant which was released near the end of its lifecycle), DAS, no mount points – shall I go on?

Now let’s back up to the previous release – SQL Server 7.0. SQL Server 7.0 allowed you to cluster up to 2 separate installations (not instances) on two nodes of a Windows failover cluster (I won’t touch that terminology in this post … people still uses MSCS which is wrong, too). Very few of you reading this post may have ever used SQL Server 7.0, let alone clustered it, but let me tell you it was a miserable experience and anyone complaining about things today (and even I do) should put things in perspective. Yes, failover clustering today still has a few things I’d love to see addressed (mainly in Setup), but I’ll take what we have now any day of the week that was the misery known as SQL Server 7.0. Why?

  1. SQL Server 7.0 forced you to put the binaries for SQL Server on the shared disk – meaning you only had one copy and you could only run the management tools from the node which did the installation.
  2. To apply a SQL Server service pack, you had to uncluster the SQL Server installation, run the update (such as SP2), and then re-cluster the install. Let’s just say that didn’t always work.
  3. The clustering implementation was a bit of a hack prior to 2000. As part of the clustering process for SQL, MS replaced some DLLs to get SQL to work in a cluster. This is why specific versions of MDAC and such were required, and if you updated them on your own, you had the potential to break your cluster with 7.0. Fun stuff that I’m glad went away with the rearchitecture of failover clustering in SQL Server 2000. The MDAC issue had a lot to do with bad perceptions of SQL Server (in general – not just clustering) as well. Those issues were fixed a long time ago – notice you never hear about MDAC and issues related to it much, if at all, any more :)

So you can see why the ability to easily have more than one clustered installation of SQL Server on the same Windows failover cluster was so appealing when Microsoft fixed this in SQL Server 2000. SQL Server 7.0′s failover clustering implementation burned a lot of people, and unfortunately helped create part of the perception that failover clustering is bad. I know a lot of folks had a hand in that, including Richard Waymire (Blog | Twitter)  who was the PM for the feature at the time.

SQL Server 2000 didn’t do clustering any favors in helping out in the perception department, either if I want to be a bit honest all these years on. With the aforementioned limitations (especially memory), you were really limited as to how many instances you could configure on a single Windows failover cluster. People had some really bad experiences with failovers. That’s why to this day I believe you still see a lot of single instance deployments on two-node Windows failover clusters – somewhere along the line they had something bad happen when more than one instance tried to run on a single node. This leaves us in a fun conundrum with other wrong perceptions, such as so-called “wasted resources” but that’s a topic for another time, too. The failover condition wasn’t the only thing that contributed to the poor perception of clustering with 2000 (the other bigger issue tended to be around the occasional file not being laid down during an update or on a remote install – not all customers encountered it which was what made it frustrating), but the entire experience was greatly improved.

In a long-winded way, this is how both “single instance (SQL Server) failover cluster” or “multiple instance (SQL Server) failover cluster” came about. This is more important than ever since Windows Server 2008 and beyond also uses the terminology failover cluster, which used to be a SQL Server only one.

Looking back at SQL Server 7.0, A/P and A/A made sense – even I’ll admit that. You could only have up to two nodes with Windows NT 4.0 Server (the predominant OS used around that time; SQL 7.0 was not developed with Windows 2000 Server in mind), and up to two installations. One instance? A/P, two instances? A/A. In that context, I have no real issue. But once the concept of instances was introduced with SQL Server 2000, all of that should have gone out the window. Now, sure, technically you could have two nodes with two instances and have it “technically” meet the A/P and A/A criteria. I get it. But it’s wrong now, and has been for 10+ years. I can’t be any more clear on this.

Let’s look at scenarios to drive the point home further.

Two Instances, Two Nodes

OK, so this is the “classic”. In a perfect world, you’ve got two nodes, each with its own instance. Since both nodes are running an instance, they are both actively hosting a SQL Server instance. In that case, it’s easy to buy into A/A if you wanted to. What happens – assuming capacity is not an issue – if there is a failover? Technically you are no longer in an A/A configuration. One node is hosting both instances. The other is down or not running any SQL Server instances. Is that now AA/P? No. This where using the right terminology – a multiple instance SQL Server failover cluster – is much more descriptive and actually makes sense.

Three Instances, Two Nodes

This one is a riff on the previous scenario. This one is a simple math problem: is it AA/A? A/AA? AAA/P? It certainly isn’t A/A/A, nor A/A/A/P. Again, the multiple instance SQL Server failover cluster terminology works shockingly well here. (“I’ve got a three instance SQL Server failover cluster.”)

Three Instances, Three Nodes

This one is similar to the Two Instances, Two Nodes scenario. If you have one instance (perfect world: no failovers) per cluster node, to some that would be A/A/A. But how many slashes do you get before things get out of control? And then what happens if failover happens (see previous scenarios for how that will go).

Two Instances, Three Nodes

Here’s another one that I see wrong, so I can kill two birds with one stone here. When Windows gained the ability to have more than two nodes in a failover cluster, the idea of a dedicated failover node – especially when you had multiple instances – was introduced. Because the limitation was at most four nodes with the  of Windows 2000 Server Datacenter edition, this was often known as the N+1 scenario. When you really could have more than four nodes and more than one dedicated failover (not getting into specific configurations and things like preferred or possible owners in this post, either), this is N+i, where i would be the number of nodes that are failover ones.

From a SQL Server point of view, how would you denote the N+1 scenario? A/A/P? That’s just silly. Again, saying something like “I have a two instance SQL Server failover cluster with a dedicated failover node” makes a whole lot more sense.

Many Instances, More Than Two Nodes

Here’s where I throw up my hands and if I could curse, I probably would. If you plan on having, say, six instances on however many nodes (let’s pick four for the sake of argument), A/A/A/A/A/A (denoting the number of instances), nor is it really A/A/A/A (denoting the number of nodes). It sure as heck isn’t A/P or A/A. Capiche?

Doesn’t Microsoft Use Both the A/P and A/A Terms Still?

Unfortunately, they have and some still do to this day. I’ve seen it in more current documentation and blog posts from them. Sigh. It doesn’t mean they are right, though. We tried with the release of SQL Server 2000 and all of the corresponding documentation (including my SQL Server 2000 High Availability book) to make things consistent. The same could be said for 2005. But A/P and A/A seem to be sticking around like a cockroach will long after humans after a nuclear blast. Not everyone studies their history, so I guess we’re doomed to repeat it.

In Conclusion

Stop the insanity!

OK, maybe that’s a bit overblown but I wanted to work that reference/kitsch in somewhere. Who wasn’t entertained by that infomercial back in the day?

Like I said in the forum post I link above, I won’t publicly shame anyone for using A/P or A/A, and I may not even correct or embarass you (I may … my apologies if I do), but I hope I’ve explained to you why they’re wrong. Honestly, I don’t get why people are not embracing the right terminology as they accurately describe the configuration. I would even venture to guess that there would be less confusion around more complex clustered SQL Server configurations if people talked about them properly.

 

Think Setup Is Bad? Try Removal!

Last week, I wrote a blog post entitled “It’s Just Setup – Who Cares?”. What I didn’t do was go into the other side of that equation: uninstalling/removing an application. As bad as some software installers can be, trying to remove that software can sometimes be 100x more painful. I’m sure you’ve experienced this either on your own personal machines (regardless of OS) or your enterprise servers at some point in your lifetime.

To be honest, this annoys me much more than a bad installation process. In my mind, any piece of software that gets put onto your computer should be able to be removed completely with none of its traces or hooks still around. I think that’s not an unreasonable expectation … or is it? I mean, I’ve seen numerous times where the odd folder/directory, configuration file, or set of registry entries is still hanging around like that bad party guest who doesn’t want to leave at the end of the end of the night.

Oh, and don’t get me started on programs that will require a reboot to complete the software removal. I have never seen an uninstaller tell me that little fact when the process begins, and I love one of these two situations more:

  1. At the end, a dialog pops up and says something like “Your computer must be rebooted to complete this process” with an OK button. You click, computer reboots.
  2. No dialog, just reboot with no care as to what else may be going on. Working on a document you haven’t saved? Too bad, so sad.

Do you want either of these situations on your computer – especially a mission critical server? Heck no! If whatever you’re trying to eradicate from your computer or server does require a reboot, at least give me the option of when I choose to do the reboot. Again, I don’t think this is a completely unreasonable request of any software vendor. I am not thrilled I’m left in a pending state, but at least I control my own destiny.

Both of these blog posts underscore one thing that holds true: pick your software carefully and test things like install and uninstall before you go live. Knowing is better than flying blind, especially in a production environment that has very tight SLAs for uptime. That 128GB server isn’t coming back online in 30 seconds after a reboot.

« Previous PageNext Page »