Code: AT-19                                      Subject: DATA WAREHOUSING AND DATA MINING                                      JUNE 2007

Time: 3 Hours                                                                                                     Max. Marks: 100

 

NOTE: There are 9 Questions in all.

·      Question 1 is compulsory and carries 20 marks. The answer to Q.1 must be written in the space provided for it in the answer book supplied and nowhere else.

·      Out of the remaining EIGHT Questions answer any FIVE Questions. Each question carries 16 marks.

·      Any required data not explicitly given may be suitably assumed and stated.

 

 

Q.1       Choose the correct or best alternative in the following:                                         (2x10)

 

       a.      PCA is a technique used for                  

               (A) Mining patterns                                (B) Reducing data

               (C) Integrating data                                (D) Cleaning data

 

       b.     CHAID stands for

               (A) Chi Automatic Interaction Detection

               (B) Chi Square Automatic Integration Detection

               (C) Chi Square Automatic Interaction Detection

               (D) None of these

 

       c.      What types of data are being mined?

(A) Data that are mined pertain to only individuals                                                     

(B) Data that are mined pertain to only businesses

 (C) Data that are mined pertain to only natural events or conditions                           

 (D) All of the above

 

       d.     Data cleansing is defined as

(A) The process of ensuring that all values in a dataset are inconsistent and

       incorrectly recorded.                      

(B) The process of ensuring that all values in a dataset are inconsistent and

       correctly recorded

(C) The process of ensuring that all values in a dataset are consistent and

       correctly recorded.                         

(D) None of these

 

       e.      What is data mining?

(A) The extraction of hidden unpredictable information from large databases.

(B) The extraction of obvious predictive information from large databases.

               (C) The extraction of hidden predictive information from large databases.

               (D) None of these

 

       f.      What is a data warehouse?

(A)   A system for delivering massive quantities of data.                                               

(B) A system for storing massive quantities of data.

(C) A system for storing and delivering massive quantities of data.                              

(D) A system connecting large number of databases.

       g.      What is a multidimensional database?

(A) A database designed for offline analytical processing. Structured as a multidimensional hypercube with one axis per dimension.

(B)    A database designed for on-line analytical processing. Structured as a      

       multidimensional hypercube with one axis per dimension.

(C)    A database structured as a multidimensional hypercube with multiple axes

       per dimension.

(D) A database structured as a one dimensional array with multiple axes.

 

       h.      OLTP stands for

               (A) online transaction processing systems

               (B) offline transaction processing systems

               (C) online transaction systems

               (D) online table processing

 

       i.       Advantages of using a data warehouse:

(A)   Enhances end-user access to a wide variety of data.                                            

(B) Enhances end-user access to a narrow variety of data.

(C) Business decision makers can obtain various kinds of trend reports.

(D) Both (A) and (C)

 

       j.      Which one of the following is not an OLAP tool?

               (A) HOLAP                                           (B) DOLAP

               (C) MOLAP                                          (D) ROLAP

 

 

 

Answer any FIVE Questions out of EIGHT Questions.

Each question carries 16 marks.

 

Q2.  a.    From the architecture point of view, what are the three data warehouse models?      (8)

 

        b.    What are the purpose and results of monitoring the data warehouse environment?      (8)

 

 

Q3.  a.    What is data integration?                                                                                        (8)

 

        b.    Briefly describe any two methods to generate concept hierarchy.                            (8)

 

 

Q4.  a.    Define data reduction. List any four strategies for data reduction.                           (8)

 

        b.    List the challenges in the naturally evolving architecture. Briefly describe any one of them.                                           (8)

 

 

Q5.  a.    What is the scope of relevance analysis for classification and prediction? Describe the attribute selection measure applied in decision tree induction.                                                                           (8)

 

        b.    Generate the frequent itemsets for the transaction data in the following table. Assume the minimum support count = 3.                                                                                                               (8)

 

               TID         List of item_IDs

               T100        A, B, E
               T200        B, D
               T300        B, C
               T400        A, B, D
               T500        A, C
               T600        B, C
               T700        A, C
               T800        A, B, C, E
               T900        A, B, C
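
For reference, one common way to approach Q5(b) is a level-wise (Apriori-style) search; the paper itself does not prescribe a method. The Python sketch below is a minimal illustration under that assumption. The names `support_count`, `apriori`, `transactions` and `MIN_SUPPORT` are hypothetical and introduced here only for illustration; the transaction data and the minimum support count are taken from the question.

```python
from itertools import combinations

# Transaction data copied from the table in Q5(b).
transactions = {
    "T100": {"A", "B", "E"},
    "T200": {"B", "D"},
    "T300": {"B", "C"},
    "T400": {"A", "B", "D"},
    "T500": {"A", "C"},
    "T600": {"B", "C"},
    "T700": {"A", "C"},
    "T800": {"A", "B", "C", "E"},
    "T900": {"A", "B", "C"},
}
MIN_SUPPORT = 3  # minimum support count stated in the question


def support_count(itemset, db):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for items in db.values() if itemset <= items)


def apriori(db, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    all_items = sorted({item for items in db.values() for item in items})
    candidates = [frozenset([item]) for item in all_items]
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep those meeting the threshold.
        counted = {c: support_count(c, db) for c in candidates}
        level = {c: n for c, n in counted.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


frequent_itemsets = apriori(transactions, MIN_SUPPORT)
for itemset, count in sorted(
    frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
):
    print(sorted(itemset), count)
```

Running the script prints each frequent itemset with its support count, ordered by itemset size; changing `MIN_SUPPORT` shows how the threshold affects the result.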

 

 

Q6.  a.    What is metadata? How does it facilitate the use of external data?                          (8)

 

       b.    What are the three kinds of data warehouse applications?                                       (8)

 

 

Q7.  a.    What are cuboids? Use an example to illustrate their use in data warehouse implementation.            (8)

 

        b.    Explain the difference between a migration plan and a methodology. Why do methodologies fail during implementation?                                                                                                    (8)  

 

 

Q8.  a.    Define Executive Information System (EIS). What are the advantages and disadvantages of an EIS?          (8)

 

       b.    What are the different components of an EIS? Discuss these components in brief.             (8)

 

 

Q9.  a.    Define the terms: association rule, support, confidence.                              (4+2+2)

 

        b.    Briefly explain the criteria to compare any two classification methods.                     (8)