


Dimension design: A different perspective


Prerequisites: An InfoCube is already created, active and filled with data; it will be used for the analysis of the dimension tables.

Dimension to Fact Ratio Computation: This ratio expresses the number of records in a dimension table as a percentage of the number of records in the fact table, i.e. what percentage of the fact table size a dimension table is. Mathematically, the equation is as below:

          Ratio (%) = No. of rows in dimension table × 100 / No. of rows in fact table
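For example, using the figures from the scenario below, a dimension table with 196250 rows against a fact table with 643850 rows gives 196250 × 100 / 643850 ≈ 30.5%, which is the ratio reported for Dimension 4.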

Dimension Table Design Concept: We have been reading and hearing over and over again that characteristics should be added to the same dimension if they share a 1:1 or 1:M relation, and should be kept in separate dimensions if they share an M:M relation. What is this 1:1 or 1:M? It is the relation that the characteristics share with each other.
For instance, if one plant can have only one storage location and one storage location can belong to only one plant at any given point in time, then the relation between them is 1:1.
If one functional location can have many pieces of equipment, but one piece of equipment can belong to only one functional location, then the relation between functional location and equipment is 1:M.
If one sales order can have many materials and one material can appear in many different sales orders, then there is no dependency between the two and the relation is many-to-many, or M:M.

Challenges in understanding the relationship: We SAP BI consultants often depend on the functional consultants to help us out with the relationships between these characteristics/fields. Due to time constraints we generally cannot dedicate time to educating the functional consultants on the purpose of this exercise, and it takes a lot of time to understand these relationships thoroughly.


Scenario: The InfoCube ZPFANLSYS had a few dimensions that were far larger than the preferred 20% ratio. It had to be redesigned so that every dimension stayed under the 20% ratio.
This ratio can either be derived manually, by comparing the number of entries in the desired dimension table (/BIC/D<infocube name><dimension number>) with the fact table (/BIC/F<infocube name> or /BIC/E<infocube name>), or the program SAP_INFOCUBE_DESIGNS can be executed in SE38, which reports this ratio for every dimension of every InfoCube in the system.
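For the manual check, a small ABAP report can count the rows and compute the ratio directly. The following is only a minimal sketch: the report name ZDIM_RATIO is hypothetical, the table names are the ones from this scenario, and the /BIC/E fact table is ignored for simplicity (a complete check would add its rows to the fact count).

REPORT zdim_ratio.
" Minimal sketch: count dimension and fact table rows and compute the ratio.
" Table names are taken from this scenario; the compressed (E) fact table is ignored.
DATA: lv_dimtab  TYPE tabname VALUE '/BIC/DZPFANLSYS2',
      lv_facttab TYPE tabname VALUE '/BIC/FZPFANLSYS',
      lv_dim     TYPE i,
      lv_fact    TYPE i,
      lv_ratio   TYPE p DECIMALS 2.

SELECT COUNT( * ) FROM (lv_dimtab) INTO lv_dim.
SELECT COUNT( * ) FROM (lv_facttab) INTO lv_fact.

IF lv_fact > 0.
  lv_ratio = lv_dim * 100 / lv_fact.
ENDIF.

WRITE: / 'Dimension rows:', lv_dim,
       / 'Fact rows:', lv_fact,
       / 'Ratio (%):', lv_ratio.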

SAP_INFOCUBE_DESIGNS:
1.jpg
We can see from the report that the total number of rows in the fact table is 643850. Dimension 2 (/BIC/DZPFANLSYS2) has around 640430 rows, which is about 99% (99.49%) of the fact table rows, and Dimension 4 (/BIC/DZPFANLSYS4) has around 196250 rows, which is about 30% (30.48%) of the fact table rows.

Infocube ZPFANLSYS:
ZPFANLSYS.jpg

Approach:

Step 1: Analysis of the dimension table /BIC/DZPFANLSYS2 to plan how to reduce its number of records.
/BIC/DZPFANLSYS2
3.jpg

Fact table:
4.jpg
The dimension table holds one record more than the fact table.
View the data in the table /BIC/DZPFANLSYS2 (the table for Dimension 2) in SE12 and sort on all the fields. This sorting helps us spot the rows that have repeated values in many columns, which eventually leads to understanding the relationship between the characteristics (the columns of the dimension table).
5.jpg

Identifying the relationships:
Once the sorting is done, we need to look for the values that repeat across the columns. All the records that repeat could have been stored as a single row with one dimension ID assigned, if all the columns had the same data. The repetition is caused by one or more columns that contribute a unique value to each row. If such columns are removed from the table, the number of rows in the table comes down.

In the below screenshot I have highlighted in green the rows that repeat themselves with new dimension IDs, as only two columns, SID_ZABNUM and SID_0NPLDA, have new values for every row. These two columns having a new value in every row cause the rest of the columns to repeat themselves, which in turn inflates the dimension table. Hence it can easily be said that these two columns do not belong in this dimension table, so the related characteristics (ZABNUM and 0NPLDA) need to be moved out of this dimension.
A few rows can also be found that repeat themselves for most of the columns but show a new value once in a while in some columns, as highlighted in yellow in the below screenshot. This indicates that these columns share a 1:M relation with the rest of the columns in the repeated rows, and they can be left in the same dimension.
6.jpg
Conclusion: The columns marked in green belong in this dimension table and the columns marked in red need to be moved to other dimension tables.
7.jpg
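To back this conclusion up with numbers, the cardinality of the suspect columns can be checked directly against the total row count; columns whose distinct-value count comes close to the number of rows are the ones driving the dimension size. The following is only a minimal sketch (the report name ZDIM_CARDINALITY is hypothetical; the table and column names are the ones from the screenshots above):

REPORT zdim_cardinality.
" Minimal sketch: compare the distinct-value counts of the suspect columns
" with the total row count of the dimension table.
DATA: lv_rows   TYPE i,
      lv_zabnum TYPE i,
      lv_nplda  TYPE i.

SELECT COUNT( * ) FROM /bic/dzpfanlsys2 INTO lv_rows.
SELECT COUNT( DISTINCT sid_zabnum ) FROM /bic/dzpfanlsys2 INTO lv_zabnum.
SELECT COUNT( DISTINCT sid_0nplda ) FROM /bic/dzpfanlsys2 INTO lv_nplda.

WRITE: / 'Total rows:', lv_rows,
       / 'Distinct SID_ZABNUM:', lv_zabnum,
       / 'Distinct SID_0NPLDA:', lv_nplda.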
Step 2: Create a copy infocube C_ZPFAN and create new dimensions to accommodate ZABNUM and 0NPLDA.
8.jpg
ZABNUM was added to dimension C_ZPFAN8 and 0NPLDA was added to C_ZPFAN7. These were marked as line-item dimensions as each of them has only one characteristic under it.
The issue with Dimension 4 was analysed in a similar way, and the other dimensions were changed accordingly.

After the changes, the data was loaded into the copy InfoCube C_ZPFAN, and the number of records in the dimension table /BIC/DC_ZPFAN2 came to 40286.
9.jpg

Ratio: 40286 / 657400 * 100 = 6.12 %


SAP_INFOCUBE_DESIGNS:
10.jpg

Dimension2 of the copy infocube: /BIC/DC_ZPFAN2
11.jpg
Even now there are a few repeated rows and columns, but the ratio is within 20%. We can create up to 13 dimensions, but it is always better to keep a dimension or two free for future enhancements.

Hope this was helpful.

By Mohanavel at scn.sap.com

Objective: Introducing a time delay in a process chain with the help of the standard SAP program RSWAITSEC.


Background: Using the interrupt process type in a process chain, we can delay the subsequent process types once the interrupt step is reached. But when we use the standard interrupt process type, we have to specify a date and time or an event name.
In many cases the interrupt step does not help: if an interrupt step is introduced to delay the subsequent processes by a definite period of time, and all the steps above the interrupt complete early, then instead of passing the trigger to the subsequent step after the desired wait time, the interrupt forces the chain to wait until the conditions defined in the interrupt are satisfied.
To delay the trigger flow from one process type to another in a process chain, without any fixed-time or event condition, we can use the program RSWAITSEC.


Scenario: In our project, one of the master data chains is scheduled at 23:00 IST. This load supplies data to a report that is based on 0CALWEEK. The data load and an ABAP program in the process chain make use of SY-DATUM, so if a load that starts on Sunday at 23:00 does not complete by 23:59:59 (a one-hour window), the entire data gets wrongly mapped to the next week, causing a data discrepancy.
So it was required to schedule the chain at 23:00 IST every day except Sunday, and at 22:45 IST (15 minutes earlier) on Sundays.


Different Ways to Achieve the above Situation:
1.        Creating two different process chains: scheduling the first process chain at 23:00 IST for Monday to Saturday (using a factory calendar) and scheduling the second at 22:45 IST only for Sunday.
Disadvantage of the 1st method:
Two chains are created unnecessarily for the same loads, which leaves multiple chains in the system.


   2.   Scheduling the same chain at 22:45 IST and adding a decision step that introduces an interruption of 15 minutes for Monday to Saturday, so that on Sunday the load starts right away at 22:45.


Process Chain with Interrupt Process Type:

PC with Interrpt Step(1).jpg
Disadvantage of 2nd method:
If you want to execute this chain immediately at some other time, the interrupt step will still make the chain wait until 23:00 IST before starting the Monday to Saturday loads.


Better Way of Achieving with RSWAITSEC program:


Schedule the chain at 22:45 and add a decision step that checks whether the day is Sunday or not. If it is Sunday, the next step leads directly to the local chain; if the day is between Monday and Saturday, the next step runs the program RSWAITSEC (an SAP standard program). In the program variant we have to mention the desired time delay in seconds (900 seconds).
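For illustration only, the day check that the decision step performs boils down to something like the following ABAP. This is a minimal sketch: the report name ZCHECK_SUNDAY is hypothetical, and the standard function module DATE_COMPUTE_DAY returns 1 for Monday through 7 for Sunday.

REPORT zcheck_sunday.
" Minimal sketch of the Sunday check (hypothetical report, not part of the chain).
DATA lv_day TYPE scal-indicator.   " day of the week: 1 = Monday ... 7 = Sunday

CALL FUNCTION 'DATE_COMPUTE_DAY'
  EXPORTING
    date = sy-datum
  IMPORTING
    day  = lv_day.

IF lv_day = 7.
  WRITE / 'Sunday: trigger the local chain immediately.'.
ELSE.
  WRITE / 'Monday to Saturday: run RSWAITSEC (900 seconds) first.'.
ENDIF.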

     Compared to the above two methods, this is the better way to achieve the desired output. Even if I run the process chain with the start process set to immediate on a day other than Sunday, the local chain will not wait until 23:00 IST; it will wait for 15 minutes and then get triggered.




       As this is an SAP-provided program, there is no need to move any transport for it; we can use it directly even in production.



Process chain with RSWAITSEC:
PC with RSWAITSEC program(2).jpg


ABAP Process type with RSWAITSEC Program (as shown in the above PC):


Program Variant for RSWAITSEC(3).jpg


Setting the Variant value (required time):


In the variant value we need to mention the desired delay in seconds. My requirement is a 15-minute delay, so I have given 900 seconds in the variant value.


Variant Value Screen(4).jpg
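What such a wait step boils down to is essentially a single WAIT statement. Purely as an illustration, a minimal custom equivalent (hypothetical report ZWAIT_DEMO with a hypothetical parameter P_SECS; this is not the source of RSWAITSEC itself) would look like this:

REPORT zwait_demo.
" Minimal sketch of a wait report: the delay in seconds comes in as a
" selection parameter, just like the variant value described above.
PARAMETERS p_secs TYPE i DEFAULT 900.   " 900 seconds = 15 minutes

WAIT UP TO p_secs SECONDS.
WRITE: / 'Waited for', p_secs, 'seconds.'.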



So we can use this program at any stage of a process chain to introduce a fixed delay.


Hope this will be helpful.


All about Data Transfer Process (DTP) - SAP BW 7

Credits: Bhushan Raval 

Data Transfer Process (DTP)


A DTP determines the process for transferring data between two persistent/non-persistent objects within BI.
As of SAP NetWeaver 7.0, an InfoPackage loads data from a source system only up to the PSA. It is the DTP that determines the further loading of data thereafter.



Use
  • Loading data from PSA to InfoProvider(s).
  • Transfer of data from one InfoProvider to another within BI.
  • Data distribution to a target outside the BI system; e.g. Open HUBs, etc.

In the process of transferring data within BI, the transformations define the mapping and logic for updating data to the data targets, whereas the extraction mode and update mode are determined by the DTP.

NOTE: A DTP is used to load data within the BI system only, except in the scenario of Virtual InfoProviders, where a DTP can be used to determine a direct data fetch from the source system at run time.


Key Benefits of using a DTP over conventional IP loading
  1. A DTP follows a one-to-one mechanism between a source and a target, i.e. one DTP sources data to only one data target, whereas an IP loads data to all data targets at once. This is one of the major advantages over the InfoPackage method, as it enables many of the other benefits.
  2. Isolation of data loading from the source into the BI system (PSA) from loading within the BI system. This helps in scheduling data loads to InfoProviders at any time after loading data from the source.
  3. Better error handling with the use of the temporary storage area, semantic keys and the error stack.


Extraction
There are two types of Extraction modes for a DTP – Full and Delta.



Full:


Update mode Full is the same as in an InfoPackage.
It selects all the data available in the source based on the Filter conditions mentioned in the DTP.
When the source of the data is any one of the below InfoProviders, only the FULL extraction mode is available.
  • InfoObjects
  • InfoSets
  • DataStore Objects for Direct Update

Delta is not possible when the source is any one of the above.


Delta:

                    
Unlike an InfoPackage, delta transfer using a DTP does not require an explicit initialization. When a DTP is executed with extraction mode Delta for the first time, all existing requests up to that point are retrieved from the source and the delta is automatically initialized.
Delta.jpg

The below 3 options are available for a DTP with Extraction Mode: Delta.
  • Only Get Delta Once.
  • Get All New Data Request By Request.
  • Retrieve Until No More New Data.


     I      Only get delta once:
If this indicator is set, a snapshot scenario is built. The Data available in the Target is an exact replica of the Source Data.
Scenario:
Let us consider a scenario wherein data is transferred from a flat file to an InfoCube. The target needs to contain the data from the latest flat file load only. Each time a new request is loaded, the previous request needs to be deleted from the target. For every new data load, any previous request loaded with the same selection criteria is to be removed from the InfoCube automatically. This is necessary whenever the source delivers only the last status of the key figures, similar to a snapshot of the source data.
Solution – Only Get Delta Once
A DTP with a full load would satisfy the requirement. However, it is not recommended to use a full DTP, the reason being that a full DTP loads all the requests from the PSA regardless of whether they were loaded previously or not. So, in order to avoid duplication of data due to full loads, we would always have to schedule a PSA deletion before the full DTP is triggered again.

‘Only Get Delta Once’ does this job in a much more efficient way, as it loads only the latest request (delta) from the PSA to the data target.
  1. Delete the previous Request from the data target.
  2. Load data up to PSA using a Full InfoPackage.
  3. Execute DTP in Extraction Mode: Delta with ‘Only Get Delta Once’ checked.

The above 3 steps can be incorporated in a Process Chain which avoids any manual intervention.


     II     Get all new data request by request:
If you set this indicator in combination with ‘Retrieve Until No More New Data’, a DTP gets data from one request in the source. When it completes processing, the DTP checks whether the source contains any further new requests. If the source contains more requests, a new DTP request is automatically generated and processed.

NOTE: If ‘Retrieve Until No More New Data’ is unchecked, the above option automatically changes to ‘Get One Request Only’. This would in turn get only one request from the source.
Also, once the DTP is activated, the option ‘Retrieve Until No More New Data’ no longer appears in the DTP maintenance.



Package Size

The number of Data records contained in one individual Data package is determined here.
Default value is 50,000.
  
 

Filter

  
The selection criteria for fetching the data from the source are determined/restricted by the filter.
filter.jpg

We have following options to restrict a value / range of values:

   Multiple selections

   OLAP variable

   ABAP Routine

A check mark (check.jpg) on the right of the Filter button indicates that filter selections exist for the DTP.
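As an illustration of the ABAP routine option: such a routine has to fill the range table for the selected field. The following is a rough sketch only, based on the usual generated template with the range table l_t_range; the field used (0CALDAY, restricted to today's date) is purely a hypothetical example, and the exact template in your system may differ.

" Hypothetical example: restrict the field CALDAY to today's date.
" l_t_range and the surrounding FORM routine are generated by the system.
DATA: l_idx LIKE sy-tabix.

READ TABLE l_t_range WITH KEY fieldname = 'CALDAY'.
l_idx = sy-tabix.

l_t_range-fieldname = 'CALDAY'.
l_t_range-sign      = 'I'.
l_t_range-option    = 'EQ'.
l_t_range-low       = sy-datum.

IF l_idx <> 0.
  MODIFY l_t_range INDEX l_idx.
ELSE.
  APPEND l_t_range.
ENDIF.
" Finally set the return code the generated template expects (e.g. p_subrc = 0).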




Semantic Groups

Choose Semantic Groups to specify how you want to build the data packages that are read from the source (DataSource or InfoProvider). To do this, define key fields. Data records that have the same key are combined in a single data package.
This setting is only relevant for DataStore objects with data fields that are overwritten. This setting also defines the key fields for the error stack. By defining the key for the error stack, you ensure that the data can be updated in the target in the correct order once the incorrect data records have been corrected.

A check mark (check.jpg) on the right side of the ‘Semantic Groups’ button indicates that semantic keys exist for the DTP.
  
  

Update
update.jpg


Error Handling


  • Deactivated:
If an error occurs, the error is reported at the package level and not at the data record level.
The incorrect records are not written to the error stack since the request is terminated and has to be updated again in its entirety.
This results in faster processing.

  • No Update, No Reporting:
If errors occur, the system terminates the update of the entire data package. The request is not released for reporting. The incorrect record is highlighted so that the error can be assigned to the data record.
The incorrect records are not written to the error stack since the request is terminated and has to be updated again in its entirety.

  • Valid Records Update, No Reporting (Request Red):
This option allows you to update valid data. This data is only released for reporting after the administrator checks the incorrect records that are not updated and manually releases the request (by a QM action, that is, setting the overall status on the Status tab page in the monitor).
The incorrect records are written to a separate error stack in which the records are edited and can be updated manually using an error DTP.

  • Valid Records Update, Reporting Possible (Request Green):
Valid records can be reported immediately. Automatic follow-up actions, such as adjusting the aggregates, are also carried out.
The incorrect records are written to a separate error stack in which the records are edited and can be updated manually using an error DTP.



Error DTP

Erroneous records from a DTP load are written to a stack called the error stack.
The error stack is a request-based table (a PSA table) into which erroneous data records from a data transfer process (DTP) are written. The error stack is based on the data source (PSA, DSO or InfoCube), that is, records from the source are written to the error stack.
In order to load the data to the data target, we need to correct the data records in the error stack and manually run the error DTP.


Execute
Execute.jpg



Processing Mode

Serial Extraction, Immediate Parallel Processing:
A request is processed in a background process when a DTP is started in a process chain or manually.

  
 

Serial in dialog process (for debugging):
A request is processed in a dialog process when it is started in debug mode from DTP maintenance.
This mode is ideal for simulating the DTP execution in Debugging mode. When this mode is selected, we have the option to activate or deactivate the session Break Points at various stages like – Extraction, Data Filtering, Error Handling, Transformation and Data Target updating.
You cannot start requests for real-time data acquisition in debug mode.

Debugging Tip:
When you want to debug a DTP, you cannot set a session breakpoint in the editor where you write the ABAP code (e.g. the DTP filter). You need to set the session breakpoint(s) in the generated program, as shown below:

Execute.jpg


No data transfer; delta status in source: fetched:
This processing mode is available only when the DTP is operated in delta mode. It is similar to a delta initialization without data transfer in an InfoPackage.
In this mode, the DTP executes directly in dialog. The generated request marks the data found in the source as fetched, but does not actually load any data to the target.
We can choose this mode even if the data has already been transferred previously using the DTP.
  
  

Delta DTP on a DSO
There are special data transfer options when the data is sourced from a DSO to another data target.

DSO.jpg

  • Active Table (with Archive)
       The data is read from the DSO active table and from the archived data.

  • Active Table (Without Archive)
       The data is only read from the active table of the DSO. If there is data in the archive or in near-line storage at the time of extraction, this data is not extracted.

  • Archive (Full Extraction Only)
       The data is only read from the archive data store. Data is not extracted from the active table.

  • Change Log
    The data is read from the change log and not the active table of the DSO.