13 Apr

Master Data Services: Installation

I’m honestly not going to spend a lot of time on MDS but I am going to fire a few posts at it.  My hope, and the chance that I’m going to take, is that I won’t get hit with any hard-core configuration issues.  If the test in some fashion asks me to do something step-by-step I’ll just have to hope I get lucky.  My hope is to get it to the point that I can get some, maybe most, questions on it, right.

So, onto installation.  Be careful here, there are a lot of steps to it.

  1. Run SQL Server Setup.

And that’s that.  However, there are a number of requirements to use it.

  • Microsoft Silverlight 5 is required.  The application will prompt you to install it.
  • It only runs on 64-bit editions of SQL Server 2012 Developer, Business Intelligence Edition and Enterprise.
  • You must install IIS 7, 8 or 9.
    • This requires a handful of things be configured.  You can see the full list here:
      http://msdn.microsoft.com/en-us/library/ee633744.aspx
    • If you are using Windows 7, which I do most of the time, you cannot install Windows Authentication, which means you cannot install MDS.  You will need to either upgrade to  Windows 7 Ultimate, use a virtual server, or find a SQL Server installation on Microsoft Server.
  • .NET 3.51 or greater.

The MDS add-in for Excel is installed separately.  You download it from Microsoft.  There are two versions: 32-bit and 64-bit.  Unfortunately, I have not tested this.  My virtual server doesn’t have Excel and the system that does, my desktop, is only Windows 7.

Finally, you can integrate Data Quality Services with MDS but you have to install it, and the Data Quality Client, on the same instance of SQL Server that you have MDS installed on.

01 Apr

Data Quality Services: What It Does

For some reason I have had a hard time getting my head wrapped around what this product does and how it does it.  I just did some research on the topic and it amazes me how many articles out there that say “the first step is creating a knowledge base, and, here’s how you do it”.  Far too many articles jump into the “how” but completely neglect the “what it does” piece.

The Big 3.

The Big 3.

What I’m going to do is break down the DQS sections and explain them in general terms.  I’m certain that I’ll see at least one generic type of question where DQS is one of a list of potential solutions to a problem and I want to get at least that much right about the product.

As such, this is a 10,000 foot version post.  I’m not going step-by-step through a process or into any real detail on the product.  All of those other websites have already done that.  I just want to nail down what each section does in a general sense because I simply couldn’t find this information in a single place.

DQS, consists of three things: Knowledge Base Management, Data Quality Projects, and Administration.  In more detail they do the following things.

Knowledge Base Management

A knowledge base is a collection of metadata that you use for cleansing, matching or profiling.  It is, roughly, equivalent to a table schema, except that it’s more involved than that.   But, as far as I’m concerned, conceptually, that works.

Domain Management.

Domain Management.

To create a knowledge base you must first create a domains.  Domains are attributes in a very broad and very loose sense.  Specifically, there are 5 parts to the demain management section of DQS.

  • Domain properties: Attributes such as name, definition, format, etc.
  • Reference data: This is where you would add external data sources.  For instance, a list of valid zip codes could probably be obtained from a 3rd party data resource here.
  • Domain Rules: Conditions that you can apply to a domain.  A valid zip code must have at least 5 letters is a condition that you could apply to a domain.
  • Domain Values: If you have a data source that you are working with, you can view it here and confirm or clean it up.  So, if the potential data in your domain consists of 94520 and 94522, this is where you would adjust 94522 to 94523.
  • Term-Based Relations: You can create a list of values that map to another value.  Biz, Bus. and Bus could all be mapped to the word Business which could help to clean up your data.

So, a knowledge base consists, more or less, of rules used to restrict and clean up your data.

Data Quality Projects

Managing the results of a cleanse.

Managing the results of a cleanse.

In this section you take a data source and apply the knowledge base you created to it.  A data project consists of the following:

  • Map: You map your target data to an existing knowledge base.  It can be in a Excel file or SQL Server database.
  • Cleanse: Where you process the previous mapping step.
  • Manage and view results: This is where you review the results of the cleansing process.  You can correct and make adjustments to the results.
  • Export: You can export your results to SQL Server, Excel or a CSV file.

Data quality projects are how you process your data through a knowledge base.

Administration

This part has two sections: Activity Monitoring and Configuration.

Activity Monitoring is a list of all activities performed in DQS.  This includes things like domain management, creating a knowledge base and cleansing data.  Right-clicking gives you additional options, including the ability to terminate a running process.  The Profiler tab at the bottom provides more information on each specific activity.

Activity Monitoring.

Activity Monitoring.

Configuration consists of 3 sections:

  • Reference Data: Manage access to third-party data providers.
  • General Settings: Some, uh, general settings for the program with help buttons that didn’t work when I clicked on them.  I’m guessing my default browser isn’t supposed to be Firefox, or, maybe they just don’t work.
  • Log Settings: Set logging options, specifically when to trigger logging.  You can drill it down to specific modules.

I hope this helps someone.  Personally, I really struggled with getting this clear in my head and writing this has definitely helped.

27 Mar

Data Quality Services Installation

On the MeasureUp practice tests I keep getting asked about DQS installation, and, I keep failing the questions. I’m deliberately not memorizing answers, and I’m not overly fond of working with this product, so that’s to be expected. I guess it’s time to stop screwing those up.

Data Quality Services is a three-step installation process.

  1. Install the Data Quality server during SQL Server installation.
  2. Install the Data Quality client during SQL Server installation.
  3. Run the DQSInstaller.

You would think I could handle that.  Anyways, the installer does some additional things.

It creates the following databases.

  • DQS_MAIN: The guts of DQS.  It includes the engine, knowledge bases and stored procedures.
  • DQS_PROJECTS: The data for DQS projects
  • DQS_STAGING_DATA: Where you store staging data for projects.

It creates the following roles in DQS_MAIN.

  • dqs_administrator
  • dqs_kb_editor
  • dqs_kb_operator role

I’ll detail those in a different post.

It also creates additional logins, the install .log file and the DQInitDQS_MAIN stored procedure.

Finally, if Master Data Services is present, a user is mapped to Master Data Services with the dq_administrator role in the DQS_MAIN database.

You can read more here,

http://msdn.microsoft.com/en-us/library/gg492277.aspx#DQSInstallation

and this guy put some crazy time in on the product.

http://www.bidn.com/blogs/cprice1979/ssas/2482/getting-started-with-data-quality-services-dqs-2012