Alan Whitehouse's Ramblings

Continuing to work until my heavy investment in lottery tickets finally pays off….

  • Categories

  • Archives

  • Deep Thought

    Historically speaking, all true change in the world has come thanks to leaders emerging, them taking charge and giving the masses someone to rally around. Can an intentionally "leaderless" movement survive or will it just slowly fade away?

  • Subscribe

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 18 other followers

  • Me!

  • Disclaimer

    The views expressed in this blog, while intelligent and always right, are strictly my own, and do not necessarily reflect the views of anyone else with which I am in any way affiliated. And don't forget, I own the rights to all information on this blog (except for the stuff I stole from other people).
  • Current Rant

    *START OF RANT*

    You want change, then get involved. Vote, run for office, go to shareholder meetings and contact advertisers or investors. Sitting around banging drums, singing kumbaya, smoking weed and having a camp out under the stars is not going to get you the change you want.

    *END OF RANT*
  • Admin Stuff

Merging Tables with Different Columns and Layouts in PowerPivot

Posted by Alan on March 24, 2011

So I have not really worked in-depth with PowerPivot up to this point.  I have played around with it but never really done anything substantial.  However, I recently embarked on a pilot project for a client where PowerPivot was potentially the right solution for them.  This client’s data resides in multiple systems, which is fairly normal now-a-days.  The kicker for them was that most of their systems are hosted by outside providers and the options to get data out of those systems is limited.  Some of the limitations are forced by the systems themselves while some are financial as the provider wants extra money for every change in the export format or for each export.  So basically I am dealing with data that comes to the client in 8 or 9 spreadsheets, all with different columns and all with different column names.  To top it off, not all the columns are consistent across the spreadsheets.

So my first challenge was how to get these spreadsheets into PowerPivot.  Now PowerPivot can easily bring in tables from Excel, but it is designed to bring in different data that can be logically linked.  My need was to consolidate all these data sources into a single table.  After pulling my hair out on how to do this, my colleague Richard Mintz (he is blogging now, check him out) pointed me to a blog post that helped me figure out what I needed to do.  Now this post assumed all the files were the same so I took his method and modified it somewhat as detailed below:

  1. I created a blank “master” Excel workbook that contained the fields I wanted to populate with “user friendly” names rather than start with one of the data sources first.
  2. I created my data connections to all my data source spreadsheets.
  3. I brought in my master Excel workbook first and then began modifying the import query for the rest of the imports.

It was in step #3 where I began to forge my own way.  So after the “UNION ALL” statement, instead of a “Select * From C:\\file_path\datasource.xlsx.[TabName$]”  statement I modified it to be a “Select As” type statement where I mapped my raw field names to my user friendly names.   An example is below:   

SELECT [‘MasterData$’].*
FROM [‘MasterData$’]
UNION ALL
SELECT [DataSource1$].XX_ID as ‘UniqueID’,
[DataSource1$].TITLE as ‘Salutation’,
[DataSource1$].FNAME as ‘First Name’,
[DataSource1$].LNAME as ‘Last Name’,
” as ‘Title’,
” as ‘Company ID’,
[DataSource1$].COMPANY as ‘Company Name’,
[DataSource1$].ADDR1 as ‘Address 1’,
[DataSource1$].ADDR2 as ‘Address 2’,
[DataSource1$].CITY as ‘City’,
[DataSource1$].PROVINCE as ‘Province’,
[DataSource1$].POSTAL as ‘Postal Code’,
” as ‘Phone’,
[DataSource1$].LAST_CONTACT_DATE as ‘Imported Date’,
‘M/D/Y’ as ‘Imported Date Format’,
[DataSource1$].SALE_AMOUNT as ‘Sale Amount’,
‘DataSource1’ as ‘Source’
FROM C:\file_path\UglyDataSource1.xlsx.[DataSource1$]
UNION ALL (repeat with next data source and modified as necessary)……

 So a couple of things I learned by doing this whole process:

  • It can be done but it is not the easiest or most intuitive thing to do.
  • You have to map to every field in your master document.  If a data source does not have that field, then you have fill it in with a blank or “n/a” or some other value.  This is what happened to me for in the sample above Title, Company ID and Phone.   Each of my queries had to be tweaked individually.
  • If you are bringing in data from multiple sources I found it helpful to add a column to track where the data came from hence the field “Source” that I added.
  • Hand coding the complete query in the PowerPivot window was tedious at best.   I gave up and wrote it in Notepad and then copied it over.
  • My different data sources all treated dates differently and this was very problematic.   However that is the subject of my next post.

If anyone has an easier way to do this I would love to hear about it.

Advertisements

9 Responses to “Merging Tables with Different Columns and Layouts in PowerPivot”

  1. Peter Eb. said

    So you can’t create a query to the original data source? You start with data in the xlsx via a multitude of partners using mechanisms not always including a sql query that you have access to?

  2. Alan said

    Some of the data comes from service providers that organizations who do the work on a contract basis for my client and use their own system. Therefore no direct access to data. The other systems are older and direct access either means the client pays more per month or is not available due to terms of service or technology. It is the downside of using a service bureau for part of your business model — they get to charge you for everything.

  3. denglishbi said

    Notepad? Are you crazy man? You got to go with Notepad++ ASAP. Maybe in Windows 8 there will be a new version in the OS…we can dream right? There is Notepad2, but I would go with Notepad++ first. Lot’s of nice features.

  4. Alan said

    Dan. Thanks for the heads-up on Notepad++. I download it and it looks promising.

  5. Michael M said

    I can’t get this to work with two different access databases, which are identical layout wise. In each DB there’s one table with almost 1 mill. rows of data. Each database contains data from one month.

    I have created one DB connection in powerpivot and got the August data in. Then I try and edit the connection using the Union All statement to get the data from the second DB, looking like this:

    SELECT [tblDataIn August 2011].*
    FROM [tblDataIn August 2011]
    Union ALL
    Select *
    From ‘C:\Users\xxx\Documents7 July 2011.accdb’.[tblDataIn Juli 2011$]

    The above statement generates an error “No colums specified”

    I have tried to specify columns in the Select statement but to no vail. What am I missing? 😦

  6. Alan said

    Try listing out the columns rather than using the ‘ * ‘ in your second select statement and see if that works.

  7. Michael M said

    Thanks for the answer. I tried to list the columns, but coulden’t make it work 🙂

  8. Steveo said

    I am trying the same thing and get no columns detected.

    SELECT [MasterHPSQLData$].* FROM [MasterHPSQLData$]
    UNION ALL
    SELECT [Table001].OSVersion as ‘OS Version”
    FROM c:\Test\Test.xlsx.[Table001$]

  9. Christopher said

    Hi all
    I had encountered the same error using

    SELECT [‘eSilk Y2000-2010$’].* FROM [‘eSilk Y2000-2010$’] WHERE ([Result return] = ‘BINGO’)
    UNION ALL
    SELECT [‘eSilk Y2000-2010$’].* FROM `D:\1. On Going\PowerPivot\PP-Test Cases\Pen Scanning\Sources\My Working\PC5.xlsx`.[‘eSilk Y2000-2010$’] WHERE ([Result return] = ‘BINGO’)

    PP guru any other alternative to merge two same table structures to one view ?

    Thank you

    Christopher

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: