Pages

Monday 1 September 2014

The following are the different phases involved in a ETL project development life cycle.

The following are the different phases involved in a ETL project development life cycle.

1) Requirement Gathering
a) Business Requirement Collection (BRD)
--> The business requirement gathering start by
Business analyst, onsite technical lead and client business users.
--> In this phase, a business analyst prepares business requirement document (BRD) or business requirement specification (BRS)
--> Br collection takes place at client location.
--> The o/p from BR analyses are
--> BRS business requirement specifications
--> SRS system requirement specifications
b) System Requirement Collection (SRD)
Senior technical people (or) ETL architect will prepare the SRS which contains s/w and h/w details
the SRS will includes
a) o/s to be used (windows or Unix)
b) rdbms required to build database (oracle, teradata etc)
c) etl tools required (Informatica, data stage)
d) olap tools required (cognos, bo)
the srs is also called as technical requirement specifications (trs)

2) Design Phase
a) High level design document (HLD)
an etl architect and dwh architect participate in designing a solution to build a dwh.
an HLD document is prepared based on business requirement.
b) Low level design document (LLD)
based on HLD, a senior etl developer prepare  low level design document
the lld contains more technical details of an etl system
an lld contains data flow diagram (dfd), details of source and targets of each mapping.
an lld also contains information about full and incremental load.
after lld then development phase will start.
3) Development Phase
a) Mapping design
--> based an lld, the etl team will create mapping(etl code)
--> after designing the mappings, the code (mappings) will be reviewed by developers.
b) code review
--> code review will be done by developer.
--> in code review, the developer will review the code and the logic but not the data.
--> the foll. activities takes place in code review
--> you have to check the naming standards of transformation, mappings of data etc.
--> source and target mapping (placed the correct logic or not in mapping)
c) peer review
--> the code will reviewed by your team member (third party developer)
4) Testing
a) unit testing
--> A unit test for the dwh is a white box testing, it should check the etl procedure and mappings.
--> the foll. are the test cases can be executed by an etl developer.
1) verify data loss
2) no. of records in the source and target
3) data load / insert
4)data load /update
5) Incremental load
6) Data accuracy
7) Verify naming standards
8) Verify column mapping
--> The unit test will be carried by etl developer in development phase.
-- > ETL developer has to do the data validations also in this phase.

b) Development integration testing
--> run all the mappings in the sequence order.4
--> first run the source to stage mappings
--> then run the mappings related to dimensions and facts.
c) System integration testing
--> after development phase, we have to move our code to QA environment.
--> in this environment, we are giving read only permissions to testing people.
--> they will test all the workflows.
--> and they will test our code according to their standards.
d) User acceptance testing (UAT)
--> This test is carried out in the presence of client side technical users to verify the data migration from source to destination.
5) Pre-production
6) Production (go-live)

Regards,
Venkat

Could anyone please tell me how to expalin an informatica project in interview..?

Could anyone please tell me how to expalin an informatica project in interview..
What      are your Daily routines?
            Cheking any imp mails are came.Need to disucss with Team lead if we have to do anywork. By EOD send the mail to team lead regarding work. need to attend weekly status meeting.

How      many mapping have you created all together in your project?

      As of now i did 3 projects first project : 48 workflows , second : 52 workflow , third project : 156 worlfows as of now.

In      which account does your Project Fall?
          account???
         hope its manufacturing and advanced services.

What      is your Reporting Hierarchy?

         me -->team lead--->Project Manager--->programming Manager
         me---> AR--->HR

How      many Complex Mapping’s have you created? Could you please me the situation      for which you have developed that Complex mapping?
          i think 8 , one for character by character  comparision (with informatica not possible) so written 950 lines if single SQL code. and one for Dynamical Hierarchy distribution .... and one for Multi byte characters ( japanes,chines,..etc) and so on...
      What is your Involvement in Performance tuning of your      Project?
   some times any way performance team will take care of my code.
What is the Schema of your Project? And why did you opt      for that particular schema?
     some_Prd --->that is the souce for us

What are your Roles in this project?

          as a developer , desing the workflows , unit testing, and so on...


Can I have one situation which you have adopted by      which performance has improved dramatically?
         yes ..... my firest project ..i faced perormance issue after go live.

Where you Involved in more than two projects      simultaneously?
         of couse ,, i involved 3 projects at a time.

Do you have any experience in the Production support?

         no

What kinds of Testing have you done on your Project      (Unit or Integration or System or UAT)? And Enhancement’s were done after      testing?
        Unit testing and sometimes integration. UAT will done by business not BI team.
How many Dimension Table are there in your Project and      how are they linked to the fact table?

         current project 18 and 2 facts . and relationship will always be dimention keys.


How do we do the Fact Load?
           loding the fact table ??
      after dimention load complete ..

How did you implement CDC in your project?

        change data capture ??
        It is always on souce modification date.


How does your Mapping in File to Load look like?

            souce --->ods--->flat file ( .dat) -- e cap delimeter

How does your Mapping in Load to Stage look like?

         is depends on project ...
          Souce -->ods--->stage

How does your Mapping in Stage to ODS look like?

          Stage to ods ???
         never ...

What is the size of your Data warehouse?
           10 TB
What is your Daily feed size and weekly feed size?
        feed size>???
Which Approach (Top down or Bottom Up) was used in      building your project?
            Bottom up ---i mean you are asking about dimention to fact and fact to dimention??
How do you access your source’s (are they Flat files or      Relational)?

             Relational and some times flat files also.

Have you developed any Stored Procedure or triggers in      this project? How did you use them and in which situation?

            no

Did your Project go live? What are the issues that you      have faced while moving your project from the Test Environment to the      Production Environment?

         yeah...i faced some issues.

What is the biggest Challenge that you encountered in      this project?

        Dynamic hierachy data distribution and moving the files from unix box to informatica directory through shell scriopt , need to clean the data in flatfile itselt.


What is the scheduler tool you have used in this      project? How did you schedule jobs using it?

          Dollar Universe.. by using $U Sessioin task.