AMIETE – IT (OLD SCHEME)

 

Code: AT19                                       Subject: DATA WAREHOUSING AND DATA MINING

JUNE 2010

Time: 3 Hours                                                                                                     Max. Marks: 100

 

 

NOTE: There are 9 Questions in all.

·      Question 1 is compulsory and carries 20 marks. Answer to Q.1 must be written in the space provided for it in the answer book supplied and nowhere else.

·      Out of the remaining EIGHT Questions, answer any FIVE Questions. Each question carries 16 marks.

·      Any required data not explicitly given, may be suitably assumed and stated.

 

 

Q.1       Choose the correct or the best alternative in the following:                                 (2 × 10)

 

       a.     Which data mining technique is used to find correlation among given data?

 

               (A) Association rule mining                      (B) Classification

               (C) Clustering                                          (D) Prediction

 

       b.     Which schema is not used in data warehousing?

 

               (A) Star                                                   (B) Fact constellation

               (C) Snowflake                                         (D) Hybrid schema

 

       c.      In OLAP, which property is satisfied?

 

               (A) Operational Processing Characteristic                                                                  

               (B) Transaction Orientation

               (C) Transaction Throughput Metric         

               (D) Historical Data used

 

       d.     Which classification method does not involve sharp cut-offs for continuous attributes?

 

               (A) Decision tree induction                       (B) Rule based classification

               (C) Fuzzy set approach                            (D) Bayesian classification

 

       e.      If the class label is not known, which data mining technique is used?

 

               (A) Classification                                     (B) Clustering

               (C) Data pre-processing                          (D) Data cleaning

 

       f.      Which method is classified as lazy learning?

 

               (A)  Fuzzy set approach                           (B) Genetic algorithm

               (C)  Case-based reasoning                       (D) Rough set approach

    

       g.      What is a subset of a data warehouse in which only a focused portion of the data warehouse information is kept?

 

              (A) Data mining tool                                (B) Data mart

               (C) Data warehouse                                (D) None of the above

 

      h.      Which of the following is a logical collection of data gathered from many databases and used to create business intelligence?

 

               (A) Competitive intelligence system     

               (B) Artificial intelligence

               (C) External intelligence gathering 'bots     

               (D) Data warehouse

 

       i.       Data warehouses are queried using:

 

               (A) Data-mining tools.                             (B) Picks and shovels.

               (C) Database management systems.         (D) Data marts.

 

       j.      Which mining system uses unstructured data and an opportunistic search mode?

 

               (A) Data mining.                                      (B) Text mining

               (C) Information retrieval                           (D) Data retrieval

 

 

Answer any FIVE Questions out of EIGHT Questions.

Each question carries 16 marks.

 

 

  Q.2     a.   What is a data warehouse? Explain the characteristics of a data warehouse.          (8)

 

             b.   Explain the star and snowflake schemas using an example.                                              (8)

 

  Q.3     a.   Give the differences between OLAP and OLTP.                                                   (7)

 

             b.   Explain the following:                                                                                         (9)

                  (i)   Drill-down analysis                        

                  (ii)  Data mart                                      

                  (iii) Virtual data warehouse

 

  Q.4     a.   Explain data transformation with reference to the following:                                                         (10)

                   (i)   Smoothing                                    

                   (ii)  Aggregation                                  

                   (iii) Generalization                               

                   (iv) Normalization                               

                   (v)  Attribute Construction

 

             b.   Explain how to handle missing values in the data cleaning process.                        (6)

 

  Q.5     a.   What is EIS? Explain its uses.                                                                            (8)

 

             b.   List common data quality problems that should be resolved by the preparation and integration phases.                                                                    (4)

 

             c.   What is the primary objective in managing the refresh process for a data warehouse?                      (4)


 

  Q.6     a.   What is Machine Learning? Discuss in brief the role of Machine Learning.           (9)

 

             b.   List seven applications of Machine Learning and their aspects.                  (7)

 

  Q.7     a.   Why are decision tree classifiers so popular? Explain.                                         (8)

 

             b.   What is external / unstructured data? Explain how such data is stored in a data warehouse.                                                                        (8)

 

  Q.8     a.   Illustrate one data mining issue that, in your view, may have a strong impact on the market and on society.                                                                              (10)

 

             b.   How is a data warehouse different from a database? How are they similar?       (6)

 

  Q.9     a.   Why is a feedback loop important for the success of data warehouse implementation?                    (6)

 

             b.   Explain in brief data migration methodology with the help of a block diagram.                          (10)