DECEMBER 2006

 

Code: T-19                                         Subject: DATA WAREHOUSING AND DATA MINING

Time: 3 Hours                                                                                                     Max. Marks: 100

 

NOTE: There are 9 Questions in all.

·      Question 1 is compulsory and carries 20 marks. The answer to Q.1 must be written in the space provided for it in the answer book supplied and nowhere else.

·      Out of the remaining EIGHT Questions answer any FIVE Questions. Each question carries 16 marks.

·      Any required data not explicitly given may be suitably assumed and stated.

 

Q.1       Choose the correct or best alternative in the following:                                         (2x10)

       

a.       Linear discriminant analysis is used in 

 

                   (A) data reduction.                              (B) data compression.                                          

(C)   data cleaning.                                (D) data pre-processing.

       

b.      Concept of Boosting applies to

 

(A)    Predictive data mining.                  (B) Data warehousing.

(C) Statistical learning.                         (D) Learning.

            

             c.   Data warehousing

                  

(A) refers to storing data offline at a separate site. 

(B) is backing up data regularly.

                   (C) is related to data mining.               

(D) uses tape as opposed to disk.

 

             d.   Which of the following is not true for a data warehouse?

 

(A)   It is a database designed for analytical tasks, using data from multiple applications.

(B)   It does not support decision-making.

(C) Its contents are periodically updated.

(D) It contains current and historical data.

 

             e.   Which of the following is not an OLAP operation?

                  

(A)     slice                                             (B) dice

(C)  roll-up                                          (D) union

 

             f.    Star schema in a data warehouse consists of 

 

(A)     dimension tables.                         

(B)     fact tables.

(C)     dimension tables and a fact table. 

(D)    no tables.

       

 

             g.   OLAP stands for

 

(A)     Off-line analytical processing.       (B) On-line analytical processing.

(C) On-line analytical process.             (D) On-line analysis process.         

 

             h.   A data cube

 

(A)    Gives a single dimensional view of data.

(B)    Gives a multidimensional view of data.

(C)    Is not used for viewing data.

(D)    Is not used for data modelling.

 

             i.    The possible designs of data mining system architecture are

 

(A)   no coupling.                                  (B) loose coupling.

(C) semi-tight and tight coupling.         (D) all of the above.

 

             j.    Which of the following is an OLAP tool?

 

(A)     ROLAP                                         (B) MOLAP

(C)  HOLAP                                         (D) all of the above.

      

 

 

Answer any FIVE Questions out of EIGHT Questions.

Each question carries 16 marks.

 

  Q.2     a.   Discuss the evolution of decision support systems. How are they inadequate at providing strategic information to decision makers?           (8)

 

             b.   What is data mining? How does data mining differ from traditional database access?                       (8)

 

  Q.3     a.   Bring out the differences between a star schema and a snowflake schema with the help of an example. Which design is preferred and why?                (6)

       

             b.  Explain the two methods of lossy data compression.                                            (6)          

       

             c.   Explain the following OLAP operations:

                  (i)    Roll-up.

                  (ii)   Drill-down.                                                                                (4)

 

  Q.4     a.   Explain discovery-driven exploration of data cubes.                                           (8)

                 

             b.   Explain how EIS (Executive Information System) is supported by the data warehouse.                    (8)

 

  Q.5     a.   Why does every structure in the data warehouse contain the time element?          (5)

 

             b.   Distinguish between data warehouses and data marts.                                         (5)

 

             c.   Enumerate the building blocks of a data warehouse.                                            (6)

 

  Q.6     a.   What is external data? Why should it be compared to internal data over a period of time? Explain how the comparison is done.                             (8)

 

             b.   What is the criterion for classification of Association rules?                                 (8)

 

  Q.7     a.   Data integration is more important in a data warehouse than in an operational system. Explain.                                                                   (8)

 

             b.   Enumerate five different steps for transforming data into a form appropriate for mining.                   (8)

 

  Q.8     a.   Explain the feedback loop between the data architect and the decision support system analyst.                                                                   (5)

 

 

             b.   How can decision tree induction be integrated with data warehousing techniques for data mining?                       (5)

 

             c.   Explain the steps that should be applied to data for classification and prediction.                       (6)

 

  Q.9           Write short notes on the following:

 

(i)                  Dimensional modelling.

(ii)                Event mapping.

(iii)               4GL technology.

(iv)              OLAP systems.                                                               (4x4=16)