How to structure data for Tableau

This post discusses how to structure data for Tableau, as well as common mistake while structuring data.

What is data?

Data is information stored in digital format for efficient storage and retrieval.

Data can be stored in different formats, excel, csv files, relational database schemas, star schemas, snowflake schema and unstructured data like Fb, twitter or Tip advisor.

How Tableau handles data ?

Tableau can connect to variety of data sources. When data is extracted and loaded in tableau, tableau automatically segregates it into dimensions and measures.

What are Dimensions and Measures

Dimension is a descriptive attribute of the data. Dimension describes the data element. Measure or fact is a quantitative attribute of the data. You can create aggregates using a measure.

Dimension provide context to the data. Without dimension ‘Measure’ has no meaning.

Dimensions are discrete and Measure is continuous. Learn more about Dimensions and Facts

In Tableau, Dimensions create labels and Measure creates axis. In the example below, Category is a dimension and Sales is a measure.

Wrong data formats

Case 1 : Wrong format of excel or csv file

If you are loading an excel or a csv file,

  • All data columns should have header. All these headers should be on the same line.
  • There should be no blank lines before the header row.
  • Each row should represent data.
  • “Total” should not be part of any data column as Tableau will aggregate data for you.

An example of the wrongly formatted data will be :

Case 2 : Loading a cross-tab

Data can come from variety of sources and can be structured differently.

See the data set below:

In this dataset, as you can see Year columns are listed individually.

If this data is loaded in Tableau, it will be difficult to create an aggregation on Year. It will also be difficult to create a chart with Year by Year comparison.

Good way to structure this data will be to format it as :

Tableau provides tools like “Data interpreter” and “Pivot” to help you structure data in the correct .

Case 3 : Loading a report in Tableau

Sometimes you may have to load a report in Tableau, such as below

The problem with this dataset is :

-It has a report title which you don’t need for your dashboard

-Blank lines

-Employee ID is a concatenation of Emp ID and Region

-Years are listed as individual Years

Resolution

When this data is loaded, Tableau automatically recognizes the problem in data and try to resolve it for you using ‘data interpreter’

Check mark Use Data Interpreter and Tableau will automatically try to resolve the problems in data.

To format the Years vertically, use Pivot

You can also split Emp ID and Region using the Split option.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.