[SQL][MCSA]70-463 考試預備 ( 4/4 )
Star Schema
- There is a single central table, called a fact table, surrounded by multiple tables called dimensions.
- An enterprise data warehouse covers multiple business areas and consists of multiple Star (and/or Snowflake) schemas.
- Dimensions with connections to multiple fact tables are called shared or conformed dimensions
Snowflake Schema
- Snowflake schema is a logical arrangement of tables in a multidimensional database resembles a snowflake. The snowflake schema is represented by centralized fact tables which are connected to multiple dimensions.
- If you do not use OLAP cubes and your reports query your data warehouse directly, then using a Star instead of a Snowflake schema might speed up the reports, because your reporting queries involve fewer joins.
Granularity Level
- The number of dimensions connected with a fact table defines the level of granularity of analysis you can get
Implementing a Data Warehouse
- Because in a data warehouse you use data primarily in read-only mode, the Simple recovery model is the most appropriate for a data warehouse.
- In your data warehouse, large fact tables typically occupy most of the space. You can optimize querying and managing large fact tables through partitioning. For most DW scenarios, having one filegroup for each partition is the most appropriate.
- You can create a view with a query that joins and aggregates data. Then you can create a clustered index on the view to get an indexed view.
Data Flow
- Many SSIS objects have a ValidateExternalMetadata property that you can set to False if the object being referenced (such as a table) does not exist when the package is being designed. This property is most commonly used for source or destination adapters when, for example, a destination table is created during package execution.
- You can configure the OLE DB destination adapter (and now, with SQL Server 2012) to insert data from the data flow through bulk batches of data, instead of one row at a time. To use this destination-optimization technique for OLE DB, edit the OLE DB destination and set the data access mode to Table Or View—Fast Load.
Package
- The intended purpose of SSIS parameters is to supply values that either should or must be determined outside the SSIS process—mostly because they cannot be determined automatically. The pivotal purpose of SSIS variables, on the other hand, is to improve reusability and optimize data retrieval. SSIS variables are therefore not subject to the same sort of restrictions as SSIS parameters.
- Transactions in SSIS use the Microsoft Distributed Transaction Coordinator (MSDTC); the MSDTC service needs to be started on the computer for transactions to work.
- At times, you might want to enable a transaction for a container but exclude some of the tasks within the container. For example, if you have a couple of Execute SQL tasks that are used for auditing purposes in a container and the TransactionOption property for the container is set to Required, the logging tasks will also be rolled back if an error occurs. To prevent the auditing tasks from rolling back, set the TransactionOption property for those tasks to Not Supported. This will still let the other tasks in the container be in the transaction, but it will not include the auditing tasks as part of the transaction.
- You can configure a package to start from the point of failure or from an earlier step when the package is rerun. In SSIS, this configuration process is called adding checkpoints.If you set the CheckpointUsage property to Always, the checkpoint file must be present or the package will not start. In addition, using checkpoints is not allowed if you have set the TransactionOption of the package to Required.