Tag Archives: MicroStrategy

Using the R Integration functionality to perform Text Mining on a MicroStrategy report and display the result

Readers:

Here is another great post in the MicroStrategy Community from Jaime Perez (photo, right) and his team. A lot of work went into the preparation of this post, and it shows some great ways to use the R integration with MicroStrategy.

Contributors from Jaime’s team include:

ssonobe

Lili 

Joanne A 

Ohingst

Enjoy!

Michael

Text Mining Using R Integration in MicroStrategy

Users may wish to perform text mining with R on the result of an arbitrary MicroStrategy report and display the result. One problem that hinders users from achieving this is that the number of output elements does not always match the number of report rows. For example, a report may have three attributes named ‘Age groups’, ‘Reviewer’, and ‘Survey feedback’ and might display four rows of feedback as follows:

01.jpg

If the above report result is sent to R as an input and the R script breaks each feedback sentence down into term frequencies grouped by age group, the output will have 18 rows.

02.jpg

Since the number of output elements is greater than the number of MicroStrategy report rows, the report execution will fail. Using the objects in the Tutorial project, this technical note (TN207734) describes one way to display the result of text mining on a MicroStrategy report using the R integration functionality.

PREMISE:
– Following the instructions in TN43665, the MicroStrategy R Integration Pack has already been installed on the Intelligence Server.

The Steps Involved

STEP 1: Decide on the input values that need to be sent to R via R metrics
The first step is to decide which data you wish to perform text mining on. In this technical note, the sample report lets users select one year element, select an arbitrary number of category elements, and specify a Revenue amount in prompts. The report then displays the value of the normalized TF-IDF (term frequency and inverse document frequency) for every word appearing in the qualified Item attribute elements, grouped by the Category elements.

A user may select the following values for each prompt and the report may look as shown below.

  • Year: 2012
  • Category: Books, Movies, and Music
  • Revenue: greater than $15,000

03.jpg

Eventually, the user may want to see the normalized TF-IDF for every word showing up in the Item attribute elements as shown below:

04.jpg

Since the final output displays each word from the Item attribute, grouped by the Category elements, the necessary input values to R are as follows:

  • The elements of the Category attribute.
  • The elements of the Item attribute.

 

STEP 2: Create metrics to pass the input values to R

The input values to R from MicroStrategy must be passed via metrics. Hence, on top of the current grid objects, additional metrics need to be created. For this sample report, since the inputs are the elements of two attributes, create two metrics with the following definitions so that the elements are displayed as metrics.

Max(Category@DESC) {~}

Max(Item@DESC) {~}

05.jpg

 

STEP 3: R script – Phase 1: Define input and output variables and write R script to obtain what you wish to display in a MicroStrategy report

In the R script, define (1) a variable that receives the inputs from MicroStrategy and (2) a variable that will be sent back to MicroStrategy as the output, as depicted below. Since the number of output elements must match the number of input elements, the output is defined as “output = mstrInput2” to avoid this error. In other words, this script executes R functions to obtain the data that you wish to display in a MicroStrategy report, but the output it returns is simply the input. More details about how to display the result in a MicroStrategy report follow later in this technical note.

06.jpg
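
As a minimal sketch, assuming the comment-header syntax that the deployR utility generates (the exact flags may differ by version), the variable definitions and the echoed output might look like this:

#MICROSTRATEGY_BEGIN
#RVAR mstrInput1 -input -string -vector
#RVAR mstrInput2 -input -string -vector
#RVAR output -output -string -vector
#MICROSTRATEGY_END

# The output must have as many elements as the input,
# so simply echo the second input back to MicroStrategy
output <- mstrInput2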

 

In this technical note, we assume that after the input values have been manipulated, the variable named ‘norm.TF.IDF’ in the R script holds the normalized TF-IDF value for each term.

07.jpg
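
The original script is not reproduced in full here, but the following base-R sketch illustrates how such a variable could be built; treating each category as one “document”, the tokenization rule, and the max-based normalization are all assumptions for illustration.

# Group the Item elements (mstrInput2) into one "document" per Category (mstrInput1)
docs  <- tapply(mstrInput2, mstrInput1, paste, collapse = " ")

# Tokenize each document into lower-case words
words <- strsplit(tolower(docs), "[^a-z]+")
terms <- sort(unique(unlist(words)))
terms <- terms[terms != ""]

# Term-frequency matrix (terms x categories) and inverse document frequency
tf  <- sapply(words, function(w) table(factor(w, levels = terms)))
idf <- log(length(words) / rowSums(tf > 0))

# Weight and normalize; max-scaling is one of several possible normalizations
norm.TF.IDF <- (tf * idf) / max(tf * idf)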

 

STEP 4: Create tables in the data warehouse to store the value of your R output

In order to display the values of ‘norm.TF.IDF’ in a MicroStrategy report, tables to hold the result need to be created in the data warehouse. In other words, an additional report will later be created in MicroStrategy, and it will extract the data from the database tables that are created in this section.

In this specific example, the variable ‘norm.TF.IDF’ has the elements of words (terms) and categories and the values of the normalized TF-IDF. Considering the types of data, the first two should be displayed as attributes and the values of the normalized TF-IDF should be presented in a metric. Hence, two lookup tables to hold the term and category elements and one fact table need to be created to store all the data. On top of these tables, one relationship table is also required since the relationship between words and categories is many-to-many.
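
As a concrete illustration, the tables might be created as follows; the column names and data types are assumptions inferred from the variables described in the next step, and the DDL is issued through the same RODBC channel that the script uses later.

library(RODBC)

ch <- odbcConnect("DSN_name")

# Lookup tables for the words (terms) and the categories
sqlQuery(ch, "CREATE TABLE tm_Word     (Word_ID int, Word_DESC varchar(255))")
sqlQuery(ch, "CREATE TABLE tm_Category (Category_ID int, Category_DESC varchar(255))")

# Relationship table for the many-to-many relationship between words and categories
sqlQuery(ch, "CREATE TABLE tm_Word_Cat (Word_ID int, Category_ID int)")

# Fact table holding the normalized TF-IDF value (the fact named 'weight' in Step 7)
sqlQuery(ch, "CREATE TABLE tm_Fact     (Word_ID int, Category_ID int, weight float)")

odbcClose(ch)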

 

STEP 5: R script – Phase 2: Populate the tables in your R script

As previously mentioned, the variable named ‘norm.TF.IDF’ contains the values that a user wishes to display in a MicroStrategy report, as shown below.

07.jpg

 

In this R script, four more variables are defined from ‘norm.TF.IDF’, each of which contains a subset of the data that will be inserted into the database tables.

 

tm_Category holds the unique elements of the Category.

10.jpg

 

tm_Word holds the unique elements of the Word (Term).

11.jpg

 

tm_Word_Cat stores the values of the many-to-many relationship.

12.jpg

 

tm_Fact contains the values of TF-IDF for every Word-Category combination.

13.jpg

 

In the R script, populate the database tables with the above four subsets of ‘norm.TF.IDF’.

# Load RODBC
library(RODBC)

# RODBC package: assign ch the connectivity information
ch <- odbcConnect("DSN_name")

# Delete all the rows of the tables
sqlClear(ch, "tm_Category", errors = TRUE)
sqlClear(ch, "tm_Word",     errors = TRUE)
sqlClear(ch, "tm_Word_Cat", errors = TRUE)
sqlClear(ch, "tm_Fact",     errors = TRUE)

# Insert the data into the tables; fast = TRUE uses a parameterized INSERT
sqlSave(ch, tm_Category, tablename = "tm_Category", rownames = FALSE, append = TRUE, fast = TRUE)
sqlSave(ch, tm_Word,     tablename = "tm_Word",     rownames = FALSE, append = TRUE, fast = TRUE)
sqlSave(ch, tm_Word_Cat, tablename = "tm_Word_Cat", rownames = FALSE, append = TRUE, fast = TRUE)
sqlSave(ch, tm_Fact,     tablename = "tm_Fact",     rownames = FALSE, append = TRUE, fast = TRUE)

# Close the channel
odbcClose(ch)

 

STEP 6: Create and add an R metric, which implements the R script

The R script is done. It is time to implement it from MicroStrategy by creating an R metric. In the deployR interface, open the R script and define the input and output variables specified in Step 3 as follows. Since the elements of the Category and Item attributes are characters, choose “String” as their data type. Likewise, since the output is the same as mstrInput2, its data type is also set to String.

14.jpg

 

Create a stand-alone metric and paste in the metric expression generated by the deployR utility. Then, replace the last parameters with the Category and Item metrics that you created in Step 2.

15.jpg
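
The pasted expression might look roughly like the sketch below, assuming the RScript function syntax produced by deployR; the script file name and the bracketed metric names are hypothetical placeholders for the two metrics created in Step 2.

RScript<_RScriptFile="TextMining.R", _InputNames="mstrInput1, mstrInput2">([Max Category], [Max Item])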

 

Add the R metric to the report.

15.2.png

 

The report and R will perform the following actions after the R metric is added:
i. The report lets users select the prompt answers
ii. MicroStrategy sends the Category and Item elements to R via the R metric
iii. R performs text mining to calculate the TF-IDF based on the inputs
iv. R generates subsets of the TF-IDF
v. R truncates the database tables and populates them with the subsets of the TF-IDF
vi. R sends the output (which is actually the input) back to MicroStrategy
vii. The report displays the values of all objects, including the R metric

 

STEP 7: Create MicroStrategy objects to display the data

From the tables created in Step 4, create the Word and Category attributes and the fact named weight. The object relationships are as depicted below.

08.jpg

09.jpg

 

Now, create a new report with these objects. This report will obtain and display the data from the database tables.

16.jpg

 

STEP 8: Utilize report-level VLDB properties to control the order of the report execution jobs

There are now two reports; let them be named R1 and R2 as described below:

  • R1: A report that prompts users to specify the report requirements and implements the R script that executes the text mining
  • R2: A report that obtains the result of the text mining from the database and displays it

 

If the two reports are placed in a document as datasets as shown below, there is one problem: R2 may start executing before R1 has populated the database tables with the result of the text mining.

17.jpg

 

In order to force R2 to execute after the completion of R1, the PRE/POST statement VLDB properties, along with an additional database table, may be used. The table tm_Flag contains a value of 0 or 1; R2 proceeds when R1 sets the value of completeFlag to 1. The detailed steps are described below with scripts for SQL Server.

 

i. Create another table in the database that holds a value of 0 or 1

CREATE TABLE tm_Flag
(
   completeFlag int
)


INSERT INTO tm_Flag VALUES(0)

 

ii. In the VLDB property ‘Report Post Statement 1’ of the R1 report, define a Transact-SQL statement that changes the value of completeFlag to 1.

DECLARE @query as nvarchar(100)
SET @query = 'UPDATE tm_Flag SET completeFlag = 1'
EXEC sp_executesql @query

 

iii. Define the VLDB property ‘Report Pre Statement 1’ in R2 so that it checks the value of completeFlag every second, looping until it becomes 1. After the loop, it reverts the value of completeFlag back to 0. After this Report Pre Statement, R2 obtains its data from the database tables that R1 has populated.

DECLARE @intFlag INT
SET @intFlag = (select max(completeFlag) from tm_Flag)

WHILE(@intFlag = 0)
BEGIN
	WAITFOR DELAY '00:00:01'
	SET @intFlag = (select max(completeFlag) from tm_Flag)
END

DECLARE @query as nvarchar(100)
SET @query = 'UPDATE tm_Flag SET completeFlag = 0'
EXEC sp_executesql @query

 

Activity Diagram

18_revised.png

 

 

Overall execution flow

1. Answer prompts

19.png

 

2. Only the text mining result is displayed to users

20.png

 

Third Party Software Installation:

WARNING: The third-party product(s) discussed in this technical note is manufactured by vendors independent of MicroStrategy. MicroStrategy makes no warranty, express, implied or otherwise, regarding this product, including its performance or reliability.

 

Bryan Brandow: Triggering Cubes & Extracts using Tableau or MicroStrategy


Bryan Brandow (photo, right), a Data Engineering Manager for a large social media company, is one of my favorite bloggers out there with regard to thought leadership and digging deep into the technical aspects of Tableau and MicroStrategy. Bryan just blogged about triggering cubes and extracts. Here is a brief synopsis.

One of the functions that never seems to be included in BI tools is an easy way to kick off an application cache job once your ETL is finished. MicroStrategy’s Cubes and Tableau’s Extracts both rely on manual or time based refresh schedules, but this leaves you in a position where your data will land in the database and you’ll either have a large gap before the dashboard is updated or you’ll be refreshing constantly and wasting lots of system resources. They both come with command line tools for kicking off a refresh, but then it’s up to you to figure out how to link your ETL jobs to call these commands. What follows is a solution that works in my environment and will probably work for yours as well. There are of course a lot of ways for your ETL tool to tell your BI tool that it’s time to refresh a cache, but this is my take on it. You won’t find a download-and-install software package here since everyone’s environment is different, but you will find ample blueprints and examples for how to build your own for your platform and for whatever BI tool you use (from what I’ve observed, this setup is fairly common). Trigger was first demoed at the Tableau Conference 2014. You can jump to the Trigger demo here.

I recommend you click on the link above and give his blog post a full read. It is well worth it.

Best regards,

Michael

Tips & Tricks #13: Star Schema in MicroStrategy

Star Schema 1

A Star Schema is a design that contains only one lookup table for each hierarchy in the data model instead of having separate lookup tables for each attribute. With only a single lookup table for each hierarchy, the IDs and descriptions of all attributes in the hierarchy are stored in the same table. This type of structure involves a great degree of redundancy. As such, star schemas are always completely denormalized. Let’s review the star schema above, based on the MicroStrategy Tutorial data model.

The schema contains only two lookup tables, one for each hierarchy. LU_LOCATION stores the data for all of the attributes in the Location hierarchy, while LU_CUSTOMER stores the data for all of the attributes in the Customer hierarchy. As a result, star schemas contain very few lookup tables: one for each hierarchy present in the data model. Each lookup table contains the IDs and descriptions (if they exist) for all of the attribute levels in the hierarchy.

Even though you have fewer tables in a star schema than in a snowflake schema, the tables can be much larger because each one stores all of the information for an entire hierarchy. When you need to query information from the fact table and join it to information in the lookup tables, only a single join is necessary in the SQL to achieve the desired result.

Joins in a Star Schema

As an example, if you run the same report to display customer state sales, only one join between the lookup and fact table is required to obtain the result set as illustrated below.

Star Schema 2

To join the Customer State description (Cust_State_Desc) to the Sales metric (calculated from Sales_Amt) requires only one join between tables since the Customer State ID and description are both stored in the LU_CUSTOMER table. As a result, the query has to access only one lookup table to obtain all of the necessary information for the report.
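
As a rough sketch of the generated SQL (the fact table and join column names are assumptions; LU_CUSTOMER, Cust_State_Desc, and Sales_Amt come from the example above):

SELECT    l.Cust_State_Desc,
          SUM(f.Sales_Amt) AS Sales
FROM      Sales_Fact f
JOIN      LU_CUSTOMER l
  ON      f.Customer_ID = l.Customer_ID
GROUP BY  l.Cust_State_Desc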

Even though achieving this result set requires only a single join, star schemas do not necessarily equate to better performance. Depending on the volume of data in any one hierarchy, you may be joining a very large lookup table to a very large fact table. In such cases, more joins between smaller tables can yield better performance.

Characteristics of a Star Schema

The following is a list of characteristics of a star schema.

  • Contains fewer tables (one per hierarchy)
  • Contains very large tables (much larger than some forms of snowflake schemas due to storing all attribute ID and description columns)
  • Stores the IDs and descriptions of all the attributes in a hierarchy in a single table
  • Requires only a single join when querying fact data, regardless of the attribute level at which you are querying data

—————————————————————————

Source: MicroStrategy University, MicroStrategy Advanced Data Warehousing, Course Guide, Version: ADVDW-931-Sep13-CG.

New! The MicroStrategy Community

MicroStrategy Community

Source: Welcome to the new MicroStrategy community! , MicroStrategy Community, September 8, 2014, http://community.microstrategy.com/t5/Community-News/Welcome-to-the-new-MicroStrategy-community/bc-p/198140#M84.

We’re very happy that you are here. We read and analyzed feedback from many events and surveys concerning our current Discussion Forums and Knowledge Base. Today, we are proud to present to you the new MicroStrategy community. We’re very excited to have reached this milestone after a lot of hard work over the past six months.

Our primary goal with our new community is to ensure the best possible experience for you when interacting with everything MicroStrategy. Not only have we enhanced the look and feel, but also merged the data of our Discussion Forums and our Knowledge Base into a single site. Our search capabilities have expanded to make our information easier to find and more accessible.

Note that not everything from the old discussion forums holds true here in the new community. For instance, we have a new ranking structure and other fun elements such as badges. There are also some changes in terminology: posts that were previously marked as “helpful” are now given a “kudo,” and instead of a “right answer” we now have “accepted solutions.”

Thanks for taking part in our beta launch. We hope it offers you a fresh perspective on what MicroStrategy can be as we focus on delivering a world-class customer experience that is simple, transparent, and empowering for everyone.

MicroStrategy’s Community Team is always open to hearing from you. Post in the Community Feedback board to share your thoughts or message us to address your personal concerns. We are always looking for more ideas and suggestions because the more we share, the better we grow and design the community.

Post away!

Lili and Daphne
MicroStrategy Community Managers

Interview Question #8 : Many-to-Many Relationships in MicroStrategy

Question

Which of the following issues can result from many-to-many relationships?

A. Exclusion of some attribute elements when drilling

B. Multiple join paths to fact tables

C. Missing values on reports including all attributes from the hierarchy

D. Multiple counting when aggregating data from base fact tables

E. Lost analytical capability

Answer

Both D and E:

D. Multiple counting when aggregating data from base fact tables

E. Lost analytical capability

 

Challenges of Many-to-Many Relationships

Because many-to-many relationships require distinct relationship tables, you have to design the logical data model and data warehouse schema in such a way that you can accurately analyze the relationship in regard to any relevant fact data.

If the structure of your logical data model and data warehouse schema does not adequately address the complexities of querying attribute data that contains many-to-many relationships, you can have problems like lost analytical capability and multiple counting.
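
To make the multiple-counting problem concrete, here is a hedged sketch using invented table and column names: suppose a relationship table Rel_Item_Catalog relates each Item to the many Catalogs it appears in.

-- Each Item can belong to several Catalogs, so the same fact row
-- contributes to more than one catalog; the catalog totals add up
-- to more than the true overall sales (multiple counting).
SELECT    c.Catalog_Desc,
          SUM(f.Sales_Amt) AS Sales
FROM      Sales_Fact f
JOIN      Rel_Item_Catalog r
  ON      r.Item_ID = f.Item_ID
JOIN      LU_CATALOG c
  ON      c.Catalog_ID = r.Catalog_ID
GROUP BY  c.Catalog_Desc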

I will be exploring both of these topics more next week as Tips and Tricks.

MicroStrategy Course Where You Will Learn About This Topic

MicroStrategy Advanced Data Warehousing Course

MicroStrategy Leads in Forrester Wave Agile BI Report, Q2, 2014

Forrester Wave Agile BI Q2 2014

MicroStrategy Analytics Platform received top scores for the features technology professionals need to enable business user business intelligence (BI) self-service, as well as for the effectiveness of its advanced data visualization (ADV) functionality in the recently published Forrester Wave: Agile Business Intelligence Report for Q2, 2014.

According to the report, “Forrester also scored MicroStrategy highly for the business user capabilities to provision applications and data and perform data integration tasks within the BI tool. MicroStrategy received high client feedback scores for its agile, business user self-service and ADV functionality. Clients also gave MicroStrategy a top score for its product vision.”

The report also stated that “in addition to its Agile BI offerings, MicroStrategy’s traditional strengths are its organically grown architecture and a powerful ROLAP engine, which in the long-term can often reduce total cost of ownership by reducing the number of reports and dashboards that need to be produced. With its integrated desktop and cloud-based SaaS offerings, MicroStrategy buyers can start small and scale quickly.”

To get a free download of the complete report, visit MicroStrategy’s Web site at http://www.MicroStrategy.com.

Michael Saylor Keynote – MicroStrategy World Barcelona – July 2014

Click on image to watch keynote video

MicroStrategy Simplifies Product Packaging to Enhance Total Customer Experience

New MicroStrategy Pricing

Converges on Four Products; Offers Free Upgrades to Premium Capabilities for Existing Clients

BARCELONA, Spain, July 8, 2014 – MicroStrategy® Incorporated (Nasdaq: MSTR), a leading worldwide provider of enterprise software platforms, today announced a new packaging structure aimed at delivering the best end-to-end customer and partner experience, making it easier than ever to acquire, deploy, and succeed with MicroStrategy. MicroStrategy also announced that it is extending free upgrades for existing clients to the premium capabilities included in the new product packaging, offering greater value to clients and new users.

The packaging changes will empower new and existing MicroStrategy clients to realize the full potential of their analytical applications using the most comprehensive analytics and mobile platforms in the industry. This information, and more, can be found at: www.microstrategy.com/experience.

“Our new packaging makes it simple for organizations to choose MicroStrategy for the totality of their business analytics and mobile application needs,” said Paul Zolfaghari, President, MicroStrategy Incorporated. “We believe it instantly enhances our value to existing customers and is emblematic of our heightened focus on delivering a positive customer experience. The new packaging allows better preparation and planning for new deployments, providing more value over the broad range of solutions we offer.”

Under the new packaging structure, MicroStrategy’s full feature set, previously split into 21 discrete offerings, has been reduced to four simple packages that empower developers, analysts, power users, and consumers to take advantage of the comprehensive MicroStrategy platform with simplified value-based pricing.

“From a customer perspective this is a welcome change,” said Andrew Young, BI Director, at Bob Evans Farms, a MicroStrategy client. “Budgeting and planning new applications will be far easier, especially breaking down platform investments to our business customers. With the simplified offering and pricing structure we can paint a more complete picture and focus on the business value.”

MicroStrategy added that the new packaging allows clients to more affordably deploy the full breadth of MicroStrategy capabilities (including data federation, write-back, closed-loop analysis, and automated report distribution, among others) to more users across the enterprise, giving end users full authoring capabilities as needed and integration with Microsoft® Office® applications. System architects gain efficiencies with the full ability to manage upgrades, migrations, and data loads, as well as free server administration and monitoring features. Within the four product offerings, clients will have all styles of analytics (self-service, dashboards, advanced analytics) across any interface (web, mobile, pdf, email report distribution) at Big Data scale—on an automated platform.

“To maximize the value of a customer relationship requires that companies simplify the pricing to ensure the purchasing process of technology is easy and transparent,” said Mark Smith, CEO and Chief Research Officer at Ventana Research. “The new MicroStrategy packaging and pricing enable the best possible customer experience, while shortening the time to gain full value from technology for organizations.”

The new packaging focuses on user roles within an enterprise:

  • MicroStrategy Server™ benefits all user roles. It includes a fully-featured server infrastructure designed to connect to multiple data sources, supports all major analytic styles from report distribution to information-driven apps to self-service data discovery, and scales to hundreds of thousands of users. It also includes administration and monitoring tools needed by organizations to effectively and efficiently manage their deployments.
  • MicroStrategy Web™ empowers business users to consume, author, and design analytics through an intuitive web-based interface. Business analysts can use MicroStrategy Web to take advantage of the all-inclusive set of self-service analytic capabilities.
  • MicroStrategy Mobile™, the award-winning, industry-leading interface for Apple iOS and Android devices, is an easy, fast, affordable way to mobilize analytics and information-driven applications to an increasingly mobile and 24 x 7 workforce.
  • MicroStrategy Architect™ provides developers with an extensive set of development, deployment and migration tools needed to efficiently manage the application development lifecycle.

The Company also noted that these four offerings complement the free MicroStrategy Analytics Desktop™ it made available last year. MicroStrategy Analytics Desktop is a free self-service business analytics tool designed to enable any individual user to gain deep insight into the user’s data by effortlessly creating powerful, insightful visualizations and dashboards.

MicroStrategy Report Optimization: Computational Distance

Computational Distance

Source: MicroStrategy University, Deploying MicroStrategy High Performance BI, V9.3.1, MicroStrategy, Inc. September, 2013.

Computational Distance

Any BI system consists of a series of processes and tools that take raw data at the very bottom, at the transaction level in a database, and by using various technologies transform that data into the finished answer that the user needs. At every step along the way, some kind of processing is done in one of the following components: the database, the network, the BI application, or the browser.

The concept of “computational distance” refers to the length, in terms of systems, transformations, and other processes, that the data must travel from its lowest level all the way to being rendered in a browser, as shown in the image above.

The longer the computational distance is for a given report, the longer it will take to execute and render. The preceding image shows a hypothetical example of a report that runs in 40 seconds. Each processing step on that report, such as aggregation, formatting, and rendering, adds to the report’s computational distance, increasing its overall execution time.

Reducing the Computational Distance of a Report

Computational distance offers a useful framework from a performance optimization perspective because it tells us that to improve the performance of a report or dashboard, you must focus on reducing its overall computational distance. The following image shows different techniques, such as caching, cubes, and aggregation, that can be used to optimize performance for the hypothetical 40-second report.

In the next blog post, we will look at two key computational distance reduction techniques offered in the MicroStrategy platform: caching and Intelligent Cubes.

Reducing the Computational Distance

MicroStrategy Report Optimization: Components of Performance

Readers:

Today, I am adding the second post in my MicroStrategy Report Optimization series. This will be a multi-part series (I will leave it open-ended so I can continue to add to it).

Today, we will look at the components that comprise performance.

Best Regards,

Michael

Source: MicroStrategy University, Deploying MicroStrategy High Performance BI, V9.3.1, MicroStrategy, Inc. September, 2013.

Components of Performance

There are five key layers or components that a typical BI query must go through. They are:

  • Caching Options
  • Data Transfer
  • System Architecture and Configuration
  • Client rendering or Data Presentation
  • Data Warehouse Access

The components above are not listed in any specific order of access during the execution of a query. The image below illustrates the five components.

The Components of High Performance

Caching and Intelligent Cubes

MicroStrategy’s memory technology is engineered to meet the increased demand for higher BI performance, which is driven by the rapid expansion of both data volumes and the number of BI users in organizations across industries. MicroStrategy accelerates performance by pre-calculating computations and placing the results into its memory acceleration engine to dramatically improve real-time query performance.

Data Transfer

Data transfer over one or more networks is a very important component of a BI implementation. Slow or poorly tuned network performance in any of those transfers will translate into poor performance from a report or dashboard execution perspective.

System Architecture and Configuration

Successful BI applications accelerate user adoption and enhance productivity, resulting in demand for more users, data, and reports. MicroStrategy provides the ability to adapt quickly to constant changes and evolve along with business requirements. MicroStrategy Intelligence Server has been proven in real-world scenarios to deliver the highest performance at scale with the fewest servers and minimum IT overhead.

Data Presentation

Dashboards provide graphical, executive views into KPIs, enabling quick business insights. MicroStrategy enables higher performing dashboards, averaging 30-45% faster execution and interactivity. Using new compression methods, MicroStrategy dashboards have a smaller footprint than ever before (up to 55% smaller), resulting in faster delivery using less network bandwidth. Dashboards deliver ever more analysis and data to end users.

Data Warehouse Access

High performance BI starts with optimizing SQL queries to retrieve results from the database as quickly as possible. BI performance depends largely on the time that the queries take to execute in the database. An average reporting request takes roughly 40 seconds to complete, of which 34 seconds, or 85% of the query time, is spent executing in the database.

Therefore, it is critical to optimize report queries to reduce database execution time.
