Ph US: 323 521 4640 Asia Pacifc: +61 407 698 591 contact@dimodelo.com

Data Warehouse Automation in Visual Studio 2015


Dimodelo Architect is ‘Data Warehouse Automation’ development tool that targets Microsoft SQL Server. Simply design a data warehouse within Dimodelo Architect and it will generate the SQL code to maintain your Staging/Data Warehouse databases, and SSIS packages. Dimodelo Architect will manage the deployment of code to multiple server environments, and manage the ETL batch runs. Dimodelo Architect is a plug-in to Visual Studio 2015. It’s familiar to developers and allows you to keep your entire solution in one place (i.e. Dimodelo Architect, SSAS, SSRS projects side by side in the same Visual Studio Solution).

Data Warehouse Designer


Star Schema Wizard
click image to zoom

Data Warehouse Designer

Designing a Data Warehouse is as simple as drag and drop. Dimodelo Architect’s visual design wizard allows you to design your Star Schema. As you Drag and Drop, Dimodelo automatically determines relationships in your source system and suggests Dimension tables. Dimodelo allows you to refine your design before the design Meta Data is generated. Step through detail pages to define included columns, types, relationships and ETL patterns, and then generate the design. Once complete, you can further refine the design by opening each Staging, Dimension or Fact table through the Solution Explorer.

Quickly define Staging Tables and Extracts

Rapidly design staging tables by importing table and column schema information from source connections. Support for any type of Source data:

  • Database – SQL Server, Oracle, DB2, MySQL etc.
  • ODBC – SharePoint, Dynamics, Salesforce, Quickbooks, Hadoop etc etc etc.
  • File – Delimited files, Excel.
  • XML.

Extract Patterns: Dimodelo is “Pattern” driven. Simply chose the Pattern you want to a apply to the Extract, and the appropriate code is generated.

  • Full Extract.
  • Incremental (Delta) Extract.
  • Date Range ( Last N Days) Extract.
  • File Extract. Imports files that match a file name pattern and manages those files.

Advanced Features:

  • Define ColumnStore indexes.
  • Define Custom indexes.
  • Define multiple extracts for a staging table (from multiple sources) and append data.
  • Tables are created within Schema in the database. Configure the schema of each entity type.

Views

  • Create Views in the Staging layer. Use those Views as the source of Dimensions and Facts.
Staging Table Staging Columns
click image to zoom
Extract Definition
click image to zoom

Define a Staging table extract in minutes

  • Select either a table, or a query as the source of the extract.
  • Define complex source queries.
  • Auto map Source to Target columns.
  • Use SQL expressions in mappings.
  • Define either Full, Incremental, Multi File or Date range extracts.
  • Handle complex extract logic (allocation, aggregation etc) using derived staging tables (i.e. staging tables derived from other staging tables).
  • Define multiple extracts for a staging table (from multiple sources) and append data.

Staging - Map Source columns to Staging columns

Dimodelo Architect will automatically map Source to Staging columns by name. Alternately you can manually map columns, or use SQL expressions in the mapping.
Staging Mapping Columns
click image to zoom
Dimension Attributes
click image to zoom

Quickly design Dimension tables

  • Import the column schema for a Dimension from a source system table or view, including column names and data types.
  • Define business keys and attributes for the Dimension including attribute descriptions, data types, slowly changing types, parent child hierarchies etc.
  • Further customize your dimension by defining custom surrogate key names and data types, and surrogate key generation rules. Dimodelo Architect supports smart keys (like you might use in a calendar dimension) where the surrogate key is set from a column in the Staging source.
  • Add custom meta data for more complex requirements or custom code generation.
  • Add descriptions to the dimension and attributes. Descriptions are included in the generated documentation.
  • Define the Dimension as a ColumnStore index.
  • Define Custom indexes.
  • Use an expression to reject rows in staging to enforce data quality.

Quickly design Fact tables

  • Import the schema for a Fact from a source system table, including name, data types.
  • Identify the business keys and add/delete/modify measures.
  • Add your own descriptions and references back to requirements documentation. Descriptions are included in generated documentation.
  • Add custom meta data for more complex design requirements.
  • Define the Fact table as a ColumnStore index.
  • Define Custom indexes.
  • Use an expression to reject rows in staging to enforce data quality.
Fact Measures Tab
click image to zoom
Dimension Attributes
click image to zoom

Add Dimensionality to your Fact

  • Select the Dimensions to associate to a Fact.
  • A Fact can be associated to the same Dimensions multiple times to easily define role plays.
  • Associating a Fact with a Dimension means the Fact table is generated with a foreign key to the Dimension, and the generated ETL looks up the appropriate Dimension member for each Fact row.
Pattern Selection

Define Fact/Dimension ETL - Select your Pattern

Dimodelo Architect uses sophisticated pattern driven code generation. When you define the ETL for a Fact or Dimension, you simply select the kind of ETL (i.e. Pattern”) you want to be generated for that Fact/Dimension. In most cases the Standard Pattern for Facts is all you need. Occasionally where full history for a fact is required (in addition to Dimension history) you use the Fact Standard Ledger pattern. All our patterns are documented in our Patterns guide. In some cases you can specify different behavior for the pattern through custom meta data e.g. date ranges for very large transaction style facts. It is even possible to define your own patterns and code generators using EzAPI. Our patterns will cover 99% of what you need.

Define Fact/Dimension ETL - Source Staging tables

The first step in defining the design of ETL for a Fact or Dimension is to define a primary staging source table and, if required, multiple secondary staging source tables. You can specify how the Staging tables are joined and any filters to apply to the Staging table data. By default the join from primary to secondary tables is a LEFT join, which is typically what is required, however, this can be modified to a FULL INNER or OUTER join. The Staging Sources are used to generate the Source query for the Fact/Dimension transformation SSIS package data flow source.
Staging Source
click image to zoom
Matches
click image to zoom

Define Fact/Dimension ETL - Source to Target matching

Every Fact/Dimension transformation needs to define how incoming rows from the Staging source are matched against existing rows in the Fact/Dimension. Dimodelo Architect auto matches columns based on name, which is the typical scenario. You can also manually specify the mapping or define SSIS expressions for matching through our SSIS expression editor.

Define Fact/Dimension ETL - Column Transformations

  • Define the mapping between staging source columns and target Dimension attributes or Fact measures.
  • Use one to one column mappings or Expressions.
  • Auto match based on name.
  • Under the advanced tab you can define the mapping between a staging column and a custom ‘smart’ surrogate key.
Transformations
click image to zoom
Expressions
click image to zoom

Define Fact/Dimension ETL - Expression Editor

You can use the inbuilt SSIS expression editor to define SSIS expressions that map source columns to target columns. The editor provides a drag and drop interface for SSIS functions (much like the expression editor in an SSIS project). It also allows you to drag and drop column names.

Define Fact/Dimension ETL - Dimension Lookups

For Facts, it is necessary to define which column(s) or expressions to use to look up the corresponding member of associated dimensions. Again Dimodelo Architect assists by providing automated mapping by column name. You can also manually map columns or use SSIS expressions in the mapping. You can also specify a filter to apply to the Dimension records prior to look up.
 Dimension Lookups
click image to zoom

Generate and Deploy


Generate and Deploy

Dimodelo Architect not only generates the code for your Data Warehouse and ETL, it will also deploy it to the Server and execute the ETL Batch. All from within the development environment. Using MSBUILD and/or TFS you can automate the build and deployment of a Dimodelo Architect project. Dimodelo Architect is like any other Visual Studio project, behind the scenes the code is simple XML files that can be placed under source control for multi-developer collaboration.
Generate, Deploy and Run
Generate
click image to zoom

Generate

No coding required. Dimodelo will generate all the code to create and maintain the Staging and Data Warehouse databases, and ETL SSIS packages. SQL Server 2008, 2012, 2014 and 2016 SSIS packages can be generated. Generation is executed from the development environment or via MSBUILD. Dimodelo Architect generates:

  • Staging Tables DDL synchronization.
  • Data Warehouse Tables DDL synchronization.
  • Extract, Transform and Load (ETL) SSIS Packages.
  • Data Warehouse Documentation.

Dimodelo makes maintaining a Data Warehouse far easier. Simply change a column, data type, mapping, expression etc., and re generate. Your change is propagated throughout the code.

Generate - SSIS Package Generation

Dimodelo Architect generates high performance, readable and  industry best practice SSIS Packages to implement the ETL for SQL Server 2008, 2012, 2014 and 2016. Our SSIS package patterns are battle hardened and real world tested.

  • We use techniques to ensure high performance, including only performing bulk updates, checksums, early elimination, and little known query hints for bulk inserts.
  • Real world testing has shown performance to be approx 1 million rows per minute or > 16000 rows per second on large fact tables with heavy insert load.
  • Our patterns address every conceivable ETL scenario including things like soft deletions, reinstatement, and late arriving dimensions.
Generate - SSIS Package Generation
click image to zoom
Generate - Non Destructive DDL Generation

Generate - Non Destructive DDL Generation

It is important that data is not destroyed when the Data Warehouse schema is updated. Other existing schema synchronization tools implement a column name change as a drop and create column pair destroying data in the existing column as they do so. Dimodelo advanced DDL generation methods preserves data, recognizes name changes, and generates dynamic scripts which test the existing state of your data warehouse, creating and modifies tables and columns as required. Dimodelo uses soft deletes to delete columns and tables, so data is never deleted unless you explicitly do so.

Deployment Made Simple

Dimodelo Architect makes code deployment simple. One click is all it takes to deploy your Data Warehouse and ETL code from the UI. You can choose to deploy Data Warehouse Code, ETL Code or both. Through project configurations you can deploy to multiple environments, with prominent warnings if you are deploying to  a non development environment. You can also use MSBUILD to automate deployments via TFS. Dimodelo Architect will even deploy custom code with the solution.
Deployment Made Simple
Deployment - Multiple Environment Support

Deployment - Multiple Environment Support

Dimodelo eases the pain of migrating code though multiple Development, Test and Production environments. Through configuration files, you can define different connections string and other variables for each environment. You can also automate Data Warehouse deployment to multiple environments using MSBUILD and the Dimodelo Architect compiler.

ETL batch execution, logging and management


dashboard

ETL Batch Analysis and Monitoring

Dimodelo Architect records ETL batch task execution information to an SQL database. Dimodelo Architect comes with some pre-configured PowerBI desktop dashboards (you don’t need PowerBI.com) that you can use to monitor and analyze you ETL batch processes. Dashboards include:

  • Latest batch run. Typically you would look at this dashboard every morning to determine the success of the overnight batch.
  • The Batch query dashboard. Allows you to query details of any batch by date and type of workflow.
  • Batch Performance dashboard. Allows you to analyze the trend in batch duration over time, and the average duration of phases and tasks over all batches.
  • The batch database captures table statistics on inserts/updates to each table on each batch execution.

Easy ETL Workflow definition

No more tiresome master SSIS packages to maintain every time a new SSIS package is added to the ETL batch. Dimodelo Architect uses a simple XML file to define the phases and tasks within the ETL batch workflow. Tasks within phases run in parallel. Dimodelo Management Console uses the workflow file to orchestrate batch execution. Task names are wild-carded, so newly generated SSIS packages are automatically included in the batch. You can define multiple workflows, for your regular daily, weekly, monthly ETL jobs. Workflows allow you to run SSIS packages and stored procedures. You can even have your own custom code execute as part of the workflow.
workflow
Dimodelo on the Server

Dimodelo Management Console

Dimodelo Management Console is a server side Batch Execution application which executes and manages ETL batches on a Server. Dimodelo Management Console will saves having to develop your own ETL Batch execution and logging system. Dimodelo Management Console orchestrates ETL tasks on your Microsoft SQL Server, scheduling tasks in the correct order (based on a workflow file), and managing status updates and failures. The Batch execution framework uses an SQL Server  database to record batch progress. Dimodelo Manager will:

  • Execute the ETL in order, according to the workflow defined in a workflow file.
  • Employ dynamic Server utilization. Choose the number of tasks (SSIS Packages) to execute concurrently according to your server capacity.
  • Log task status.
  • Gather Batch statistics.
  • Handle multiple batch schedules i.e. Daily, Weekly, Monthly.

Dimodelo Management console can be scheduled to run via either an SQL Server Agent Job, or Windows Task Scheduler.

Request Demo
Request Demo

Request Demo

If you provide a phone number, we'll call to organise an appointment.
Sending