Showing posts with label dimension. Show all posts
Showing posts with label dimension. Show all posts

Thursday, March 29, 2012

Concatenate Dimension Attributes

Hi people,

This might be a really simple one hopefully, but is there a way I can create an attribute that is 2 other attributes concatenated, i.e.

Dim_Employee

EmployeeName

Employee Surname

Can I get EmployeeFullname by concatenating Name & Surname and if so how?

In SSAS2005 you create a new named calculation on the table in the datasource view.

Use TSQL EmployeeName + ',' + [Employee SureName]

In AS2000 you will have to write this TSQL in the name column for the dimension level. You do this in the dimension editor.

HTH

Thomas Ivarsson

|||

Cheers.

Easy when you know how :)

Sunday, March 25, 2012

Computed measures and role-playing dimensions

I've got a database with a single Date dimension that is used as a role-playing dimension in the cube. Measure groups have 2-4 different role relationships with the date dimension (e.g. sell date, ship date).

I've got some calculated members, defined in the style used by the Time Intelligence wizard, roughly:

Scope(
{
[Measures].[Amount],
[Measures].[Count]
}
);

( [Period].[Period].[60 calendar days],
[First Date].[Date].[Date].Members ) =

Aggregate(
{ [Period].[Period].DefaultMember } *
{ [First Date].[Date].CurrentMember.lag(59) : [First Date].[Date].CurrentMember }
);

( [Period].[Period].[60 calendar days],
[Second Date].[Date].[Date].Members ) =

Aggregate(
{ [Period].[Period].DefaultMember } *
{ [Second Date].[Date].CurrentMember.lag(59) : [Second Date].[Date].CurrentMember }
);


End Scope;

Here [First Date] and [Second Date] represent two of my role-playing date dimensions.

What I'm noticing is that this doesn't work: the calculated members are calculated correctly for whichever date dimension is listed last in the script and are apparently erased for whichever one was listed first.

The particular measure group I'm using in my sample query has a relationship with both date roles, and in fact they're identical for this measure group - both are based on the same underlying column in the relational data (that's not true for all measure groups, it just happens to be in this case).

Can someone explain what I'm seeing? Is this a bug, or a subtety that I haven't accounted for? How can I make this work for 2 or more role-playing date dimensions at the same time?

Well, if you will position current coordinate on the [First Date].[Date].[Date] level - then you will see that the first formula is applied correctly. What you probably mean by saying "last formula erases whatever was listed first" is that if you look at the aggregate level - then the aggregation is computed from the last formula. Obviously, at the aggregate level there is a conflict - which formula should be used to aggregate - it could be either first or second. The precendence rules in AS say that the later assignment wins. This is not a bug, after all, single cell cannot be calculated using two different formulas !

Perhaps if you will explain what exact results you would like to see for specific queries - this forum will be able to craft MDX script to solve it.

|||

I neglected to include representative queries, so here's one that corresponds to the names I used in the first post:

select
{
[Measures].[Amount]
} on columns,
{
[Period].[Period].&[60]
} on rows
from
[Database]
where
(
-- [First Date].[Date].&[20061228]
-- [Second Date].[Date].&[20061228]
)

([Period].[Period].&[60] is the same member as [Period].[Period].[60 calendar days])

Given the computations given above, and the fact that [First Date] and [Second Date] actually refer to the same column in the fact table for this measure group, I'd expect that the above query would return the same results with either of the two lines in the where clause uncommented, but that's not the case.

Whichever date dimensions corresponds to the second set of calculations in the script produces the correct result while the other produces a wrong result - I'm not sure exactly what the wrong result represents - it's a larger number than anything I can think up that might make sense (e.g. it's not the aggregation totally ignoring the date dimension in question - it's something else).

|||

Mosha Pasumansky wrote:

Obviously, at the aggregate level there is a conflict - which formula should be used to aggregate - it could be either first or second. The precendence rules in AS say that the later assignment wins. This is not a bug, after all, single cell cannot be calculated using two different formulas !

Why is this obvious? As far as I can see, the calculations are not referencing the same locations in the cube. Please enlighten me! :)

|||

Sorry, I thought it was clear :(

Both calculations apply to the leaf level of their respective Date dimension, i.e. [First Date].[Date].[Date] and [Second Date].[Date].[Date]. So the question is, how the value at ([First Date].[Date].[All Dates], [Second Date].[Date].[All Dates]) should be computed. It can aggregate from either one of lower levels - but depending on which one it will aggregate from, the results will be different, so AS has to choose one or the other.

|||

OK, I can see that there's an ambiguity on those particular members, but I'm not querying those members (or am I?).

So how can I get the results I want?

|||> but I'm not querying those members (or am I?).

Can you please provide the query you are using

> So how can I get the results I want?

Can you please describe what exactly result do you want.

|||I already did - see my second post in this thread.|||I see, that you added the query later, although it is still not clear from your description what is the expected result. But I am going to guess that what you want is either LastChild or LastNonEmpty semiadditive aggregation. If this is true, you need to define your measure as semiadditive with this aggregation function. Note, that you still will only be able either First Date or Second Date, but not both, because there is still a conflict at aggregate level.|||Just to add my 2 cents' worth - based on your comment that "[Period].[Period].&[60] is the same member as [Period].[Period].[60 calendar days]", is this a physical (place-holder) rather than calculated dimension member, to which the trailing-60-day calculation is applied? If so, my guess is that "the fact that [First Date] and [Second Date] actually refer to the same column in the fact table for this measure group" could be significant. If you instead create a [Period].[Period].[60 calendar days] calculated member of the [Period].[Period] hierarchy, do you get the expected results?|||

I'm quite certain that I don't want LastChild or LastNonEmpty (especially since those are EE only features).

What I want is for the two dimensions to act as two independent dimensions without regard for the conflict at ([First Date].[Date].[All],[Second Date].[Date].[All]) since I will never query any value in that slice.

I'm not sure what it is I'm failing to communicate or understand - this seems like it should be straightforward and not particularly unusual. Lots of cubes have multiple date dimensions (and I expect lots of those have those date dimensions all backed up by a single role-playing dimension in the database).

Any suggestions are welcome - for now, I've abandoned the whole notion (being able to get time-based aggregations based on multiple time dimensions) as being inherently not supported by MDX/SSAS.

|||

It must be my failure to understand what you are trying to achive, and I appologize for that. It must something really simple, since it looks straigtforward to you, yet I fail to grasp it. Perhaps you should start a new thread, since this got plenty of replies leading nowhere, and people don't look at threads with complicated histories but do look at fresh threads.

One last note - the cubes with multiple Time dimensions are certainly not unusual, I've seen many of them working fine (even Adventure Works sample features 3 Time dimensions).

|||You didn't mention what the results of trying my earlier suggestion were - ie. creating a calculated member, rather than using an actual member, for [Period].[Period].[60 calendar days]?|||

Sorry, I forgot to reply to your suggestion.

This [Period] dimension is a real, physical dimension in the cube - and is so because I have some measure groups that associate directly with the period dimension (those "limit" measures that Mosha helped me with in another thread).

So, when I say [Period].[Period].[60 Calendar Days] is the same member as [Period].[Period].&[60], that's exactly what I mean - the member with key 60 has the name '60 Calendar Days'.

I really can't experiment with having this be a pure calculated member - I need that physical dimension to be this same dimension so that the relationships within the cube are correct.

Other than getting the "time based intelligence" to work in two time dimensions in the same cube, this all works beautifully. I have an inkling of how to get what I wanted - at the moment, I don't actually need to solve the problem right now, but sooner or later, I'll have to confront it.

I think that the solution is something like this:

Scope(
{
[Measures].[Amount],
[Measures].[Count]
}
);

( [Period].[Period].[60 calendar days],
[First Date].[Date].[Date].Members,
[Second Date].[Date].DefaultMember
) =

Aggregate(
{ [Period].[Period].DefaultMember } *
{ [First Date].[Date].CurrentMember.lag(59) : [First Date].[Date].CurrentMember }
);

( [Period].[Period].[60 calendar days],
[Second Date].[Date].[Date].Members,
[First Date].[Date].DefaultMember
) =

Aggregate(
{ [Period].[Period].DefaultMember } *
{ [Second Date].[Date].CurrentMember.lag(59) : [Second Date].[Date].CurrentMember }
);


End Scope;

If I get a chance to try it, I'll post back with the results.

|||

FYI - I put a similar test case together in Adventure Works, relating [Date] and [Ship Date] to the same fact date field. Using the [Scenario] dimension &[2] (Budget) member to hold trailing 60-day calculations, only calculations for the 2nd dimension (Ship Date) work. But if a new calculated member like [Scenario].[Scenario].[Trailing60] is used instead, calculations on both [Date] and [Ship Date] seem to work fine.

Since you might not wish to go down the calculated member path, another approach which seems to work is "freezing" the first calculation, to prevent its results being changed by the next calculation. In your scenario, something like:

Scope(
{
[Measures].[Amount],
[Measures].[Count]
},

[Period].[Period].[60 calendar days]
);

Scope( [First Date].[Date].[Date].Members );

this =

Aggregate(
{ [Period].[Period].DefaultMember } *
{ [First Date].[Date].CurrentMember.lag(59) : [First Date].[Date].CurrentMember }
);

freeze(this);

End Scope;

( [Second Date].[Date].[Date].Members ) =

Aggregate(
{ [Period].[Period].DefaultMember } *
{ [Second Date].[Date].CurrentMember.lag(59) : [Second Date].[Date].CurrentMember }
);


End Scope;

Monday, March 19, 2012

Composite Primary Key in fact table

I have a sales fact table with dimensions: dimDate, dimCustomer, dimSalesRep, dimProduct. I am using the surrogate key from each dimension as the composite primary key in my Sales fact table. I had a problem in that the same customer may order the same product multiple times from the same sales rep on the same date. In other words I had a situation where my composite primary key would not have been unique.

It was suggested in another thread that I add the Sales Order Number, which is a bigint identity field in the OLTP, as an additional field in the PK in order to guarantee uniqueness. This I've done.

My question is this: The Sales Order Number will guarantee a unique factSales row so do I need to include all of the surrogate keys in my composite primary key or just just Sales Order Number as the primary key and maintain the surrogate keys as attributes for use in loading data?

Thanks.

Logically speaking, the fact table does not have a composite key. You may have a degenerate dimension in the fact table (such as a Sales Order Number) but logically the primary key is really associated with that dimension. Lots of folks try to put a primary key constraint on their fact tables but you do not necessarily need it from a logical modeling perspective.

In your specific situation, I'd recommend your fact table include the Sales Order Number which would have a PK on it. Foreign key references to your other dimensions would reside in your fact table as well.

B.

|||

Hello! Have a look at www.kimballgroup.com for help with general design issues.

Normally you will create foreign keys from each dimension table to the fact table. Not a composite key.

If the customer order the same product several times on the same date from the sales representative you have the option to aggregate that information to the date level(you will only see totals for one date) or to use the time part of the date column.

It is an design issues for your business requirement.

HTH

Thomas Ivarsson

|||Thanks. I'm pulling data from 8 or 10 different OLTP tables so it'd me next to impossible to aggregate. The Sales Order ID is an identity field in the OLTP so it will always be unique. I'm using that as my primary key in my fact table.

I'm not sure where I got the idea for the composite Primary Key.......must be relational hangover from my OLTP design days......


I've used the Kimball group vid on the SSIS forum as the basis of my ETL design. I'll check out their design ideas. (I'm trying to work through the Melomed, et. all. SSAS book but I get the impression that the authors never heard of Visual Studio....Call me lazy but I'm not going to design my enterprise DW solution writing several hundred thousand lines of XMLA code.)

|||

The Melomed et al. book is really an "Inside SSAS"-type exploration of the technology. BIDS/VS really is just hiding the XMLA stuff from you with a slick interface.

Regarding the Kimball-standards, check out Chapter 5 of "The Data Warehouse Toolkit". I think it covers the exact star-schema you are attempting to create.

Good luck,

B.

|||I use the code veiw when I have to. I just wish they could relate the inside information to modern tools, perhaps refer to both the actual desing interface AND the underlying code. They could also refer to all the examples in binary but we have modern tools these days.....|||

From "The Microsoft Data Warehouse Toolkit" by Kimball Group, page 42: "The key to the fact table is a multi-part key made up of a subset of the foreign keys from each dimension table involved in the business event."

That would seem to say something different.

|||

Kimball also says you should model your data at the lowest level of granularity available.

So, if you assemble a model at the transaction level but do not include the transaction's primary key elements in your dimensions, setting a primary key on your foreign key may result in constraint violations. The only way to avoid that is to pre-aggregate your data to the level supported by your dimensions but you don't want to do that because of data lose (hence Kimball's mandate).

If you also review the models in The Data Warehouse Toolkit, you will not find many fact tables with primary keys defined on them.

B.

Composite Primary Key in fact table

I have a sales fact table with dimensions: dimDate, dimCustomer, dimSalesRep, dimProduct. I am using the surrogate key from each dimension as the composite primary key in my Sales fact table. I had a problem in that the same customer may order the same product multiple times from the same sales rep on the same date. In other words I had a situation where my composite primary key would not have been unique.

It was suggested in another thread that I add the Sales Order Number, which is a bigint identity field in the OLTP, as an additional field in the PK in order to guarantee uniqueness. This I've done.

My question is this: The Sales Order Number will guarantee a unique factSales row so do I need to include all of the surrogate keys in my composite primary key or just just Sales Order Number as the primary key and maintain the surrogate keys as attributes for use in loading data?

Thanks.

Logically speaking, the fact table does not have a composite key. You may have a degenerate dimension in the fact table (such as a Sales Order Number) but logically the primary key is really associated with that dimension. Lots of folks try to put a primary key constraint on their fact tables but you do not necessarily need it from a logical modeling perspective.

In your specific situation, I'd recommend your fact table include the Sales Order Number which would have a PK on it. Foreign key references to your other dimensions would reside in your fact table as well.

B.

|||

Hello! Have a look at www.kimballgroup.com for help with general design issues.

Normally you will create foreign keys from each dimension table to the fact table. Not a composite key.

If the customer order the same product several times on the same date from the sales representative you have the option to aggregate that information to the date level(you will only see totals for one date) or to use the time part of the date column.

It is an design issues for your business requirement.

HTH

Thomas Ivarsson

|||Thanks. I'm pulling data from 8 or 10 different OLTP tables so it'd me next to impossible to aggregate. The Sales Order ID is an identity field in the OLTP so it will always be unique. I'm using that as my primary key in my fact table.

I'm not sure where I got the idea for the composite Primary Key.......must be relational hangover from my OLTP design days......


I've used the Kimball group vid on the SSIS forum as the basis of my ETL design. I'll check out their design ideas. (I'm trying to work through the Melomed, et. all. SSAS book but I get the impression that the authors never heard of Visual Studio....Call me lazy but I'm not going to design my enterprise DW solution writing several hundred thousand lines of XMLA code.)

|||

The Melomed et al. book is really an "Inside SSAS"-type exploration of the technology. BIDS/VS really is just hiding the XMLA stuff from you with a slick interface.

Regarding the Kimball-standards, check out Chapter 5 of "The Data Warehouse Toolkit". I think it covers the exact star-schema you are attempting to create.

Good luck,

B.

|||I use the code veiw when I have to. I just wish they could relate the inside information to modern tools, perhaps refer to both the actual desing interface AND the underlying code. They could also refer to all the examples in binary but we have modern tools these days.....|||

From "The Microsoft Data Warehouse Toolkit" by Kimball Group, page 42: "The key to the fact table is a multi-part key made up of a subset of the foreign keys from each dimension table involved in the business event."

That would seem to say something different.

|||

Kimball also says you should model your data at the lowest level of granularity available.

So, if you assemble a model at the transaction level but do not include the transaction's primary key elements in your dimensions, setting a primary key on your foreign key may result in constraint violations. The only way to avoid that is to pre-aggregate your data to the level supported by your dimensions but you don't want to do that because of data lose (hence Kimball's mandate).

If you also review the models in The Data Warehouse Toolkit, you will not find many fact tables with primary keys defined on them.

B.

Composite Primary Key in fact table

I have a sales fact table with dimensions: dimDate, dimCustomer, dimSalesRep, dimProduct. I am using the surrogate key from each dimension as the composite primary key in my Sales fact table. I had a problem in that the same customer may order the same product multiple times from the same sales rep on the same date. In other words I had a situation where my composite primary key would not have been unique.

It was suggested in another thread that I add the Sales Order Number, which is a bigint identity field in the OLTP, as an additional field in the PK in order to guarantee uniqueness. This I've done.

My question is this: The Sales Order Number will guarantee a unique factSales row so do I need to include all of the surrogate keys in my composite primary key or just just Sales Order Number as the primary key and maintain the surrogate keys as attributes for use in loading data?

Thanks.

Logically speaking, the fact table does not have a composite key. You may have a degenerate dimension in the fact table (such as a Sales Order Number) but logically the primary key is really associated with that dimension. Lots of folks try to put a primary key constraint on their fact tables but you do not necessarily need it from a logical modeling perspective.

In your specific situation, I'd recommend your fact table include the Sales Order Number which would have a PK on it. Foreign key references to your other dimensions would reside in your fact table as well.

B.

|||

Hello! Have a look at www.kimballgroup.com for help with general design issues.

Normally you will create foreign keys from each dimension table to the fact table. Not a composite key.

If the customer order the same product several times on the same date from the sales representative you have the option to aggregate that information to the date level(you will only see totals for one date) or to use the time part of the date column.

It is an design issues for your business requirement.

HTH

Thomas Ivarsson

|||Thanks. I'm pulling data from 8 or 10 different OLTP tables so it'd me next to impossible to aggregate. The Sales Order ID is an identity field in the OLTP so it will always be unique. I'm using that as my primary key in my fact table.

I'm not sure where I got the idea for the composite Primary Key.......must be relational hangover from my OLTP design days......


I've used the Kimball group vid on the SSIS forum as the basis of my ETL design. I'll check out their design ideas. (I'm trying to work through the Melomed, et. all. SSAS book but I get the impression that the authors never heard of Visual Studio....Call me lazy but I'm not going to design my enterprise DW solution writing several hundred thousand lines of XMLA code.)

|||

The Melomed et al. book is really an "Inside SSAS"-type exploration of the technology. BIDS/VS really is just hiding the XMLA stuff from you with a slick interface.

Regarding the Kimball-standards, check out Chapter 5 of "The Data Warehouse Toolkit". I think it covers the exact star-schema you are attempting to create.

Good luck,

B.

|||I use the code veiw when I have to. I just wish they could relate the inside information to modern tools, perhaps refer to both the actual desing interface AND the underlying code. They could also refer to all the examples in binary but we have modern tools these days.....|||

From "The Microsoft Data Warehouse Toolkit" by Kimball Group, page 42: "The key to the fact table is a multi-part key made up of a subset of the foreign keys from each dimension table involved in the business event."

That would seem to say something different.

|||

Kimball also says you should model your data at the lowest level of granularity available.

So, if you assemble a model at the transaction level but do not include the transaction's primary key elements in your dimensions, setting a primary key on your foreign key may result in constraint violations. The only way to avoid that is to pre-aggregate your data to the level supported by your dimensions but you don't want to do that because of data lose (hence Kimball's mandate).

If you also review the models in The Data Warehouse Toolkit, you will not find many fact tables with primary keys defined on them.

B.

Composite Keys on Dimensions

I have a question about Composite Keys on Dimensions. I run the Dimension Health Checker that comes with BIDS Helper and it tells me I have many to many relationships happening, so I change key columns to multiple columns to get rid of the many to many, which is good.

Now, when I browse my cube, I pull in a Dimension property , say the lowest level of a heirarchy, and I see duplicates where before it was rolled up.

Is there an easy way to get rid of the many to many with composite keys yet still have the data rolled up to my name column?
i am guessing CustomRollupColumn is the answer? Smile
|||

Instead of using the key as the keycolum you could use the name of the attribute e.g.

key_size name_size

1 big

2 big

3 small

4 medium

if your keycolumn is key_size you will get four values if the keycolumn is name_size you will only get three where keys 1 and 2 are pointed to by the name_size big.

Hope that helps

Matt

|||that is how it was set up, but with the attribute relationships and heirarchies, the dimension health checker comes back with many to many relationships, so i have to make the key composite to get uniques across..

the dimensions are slowly changing collapsed snoflake dimensions..so there are possiblities for dupes
|||

The composite keys only really need to exist in the DSV to join the tables together, do they not? Each individual attribute may not need composite keys. Think I must have missed something here, any chance of a more detailed example, perhaps two of the tables.

Cheers

Matt

|||i have one dimension.. called Source

there is a heirarchy Dept->Account->Campaign

since the dimension is slowly changing there are duplicates when campaigns change

for my Campaign i have the keycolumn as Campaign and Account, name column as Campaign as the BIDS dimension health checker said there were many to many relationships occuring. When i changed the keycolumn to the two columns, then reran the health checker, the warning/error went away...

before i had the composite key column, i could drag campaign and say revenue onto a pivot table and it would rollup to campaigns, now when i do that, it looks like there are duplicates because really it is rolling up to the keycolumn (Campaign and Account).

What i want to do is still roll up to the campaign name even though I have the key column as campaign+account
|||

Hi,

Ok i think i understand, it should look ok in the hierarchy? yes. But if you browse the campaign attribute, it looks like there are duplicates, this is as expected. Two solutions, remove the composite key and do not make an attribute relationship between the attributes - thus giving you a little yellow triangle warning in the hierarchy.

OR, rename campaign to campaign2, create another attribute Campaign and don't use a composite key on that one, just the key campaign. If you want to browse the campaign list, just use campaign attribute but if you want to use the hierarchy use campaign2. In effect creating four attributes:

-Campaign

-Campaign2

-Account

-Dept

A hierarchy

-Dept

--Account

Campaign2 (change its name to Campaign to reduce confusing the users)

And perhaps in the perspectives hide the individual attribute campaign2 so no one ever see it. Bit of a work around, but removes the duplicates.

Not to sure if there is another way of doing it. Maybe someone else has a better idea.

Hope that helps

Matt

|||yes , for now i just removed the composite key and left the little warning in the heirarchies, but i would like to fix that as i want to design my aggregations and get best performance

your workaround might just work as well, something I can probably try...I dont know if there is another solution or something else to try, but you would think there would be a way to roll it up on a column instead of the key, i dunno..

thanks for all your help on this

Friday, February 24, 2012

Complex dimension question

I have two dimension; 1 is a 'Users' dimension which has information about users of our system, the other is a self-referencing dimension containing industry hierarchies.

The Users dimension is linked to the Industry dimension by Industry ID.

I tried adding the Industry hierarchy to the Users dimension but when I go to browse the hierarchy all I see is IDs of the DimUser table, so I think that I did something wrong.

My goal is to add the industry hierarchy to the User dimension.
Can I do this with my current architecture?

Here are some examples of the two dimension:

DimUsers
- ID
- Name
- IndustryID (FK -> DimIndustry.ID)

DimIndustry
- ID
- IndustryName
- ParentID (FK -> DimIndustry.ID)

Thanks in advance.

Preston,

You should be able to do this using the following steps:

1.) Edit the User dimension

2.) In the "Data Source View" pane of the dimension editor, and select "Show Tables..."

3.) Add the DimIndustry table to the data source view pane

4.) Drag the "IndustryID" from your DimUsers table as a new attribute.

5.) Change the "NameColumn" property to pull the "IndustryName" column from the DimIndustry table.

6.) Create your hierarchy using the attributes you want and process.

HTH,

- Steve