       DM Project for CS240B (Revised CS240A Take Home final)
      Your final project is to build an efficient Naive Bayesian classifier 
        for a dataset of your choice using WEKA. 
        For instance, uci/kdd and Weka are two good sources of data sets.
      
      Good results were reported in the past with datasets such as led, 
        mushrooms, splice, titanic, waveform, abalone, letter, and census. But 
        data are continuously being revised and upgraded, and you are encouraged 
        to try new data sets. However, make sure that your data set is not small; 
        otherwise your performance experiments will not be interesting. 
      You are also encouraged to try new applications, and if you have your own 
        interesting application, you should consider using it.
      Your specific tasks are as follows (you should try to implement them 
        using clean and compact SQL):
      
        - Perform a preliminary analysis of your data and decide how you 
          are going to deal with missing values and whether you are going to 
          discretize continuous values or instead assume a Gaussian 
          distribution or some other kind of distribution.
         
        -  Partition your data into two sets MS and PS. The set MS will be 
          used to build your classifier. The set PS will be used to predict its 
          accuracy by testing.
 
           
            
        - Derive your Naive Bayesian Classifier and determine its accuracy.
 
           
            
        - Repeat the last step with other kinds of classifiers (e.g., decision 
          trees).
 
           
            
        - [Ensemble-based bagging] See if you can get better decisions by 
          using voting ensembles of classifiers (possibly by assigning weights 
          to the votes of each classifier).
 
           
            
        - Write a nice report describing the various steps of analysis, 
          implementation, and testing that you have performed and what you 
          might have learned in the course of this project.
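
      The partition, training, and accuracy tasks listed above can be sketched 
        in a few lines. The following is only an illustrative Python sketch, not 
        the WEKA or SQL implementation the project asks for. It assumes all 
        attributes are categorical (already discretized) and uses Laplace 
        smoothing, which is one possible choice. The function names, the 70/30 
        split, and the row layout are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def split_ms_ps(rows, ms_fraction=0.7, seed=0):
    """Shuffle rows and split them into a model set MS and a test set PS."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * ms_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_naive_bayes(ms):
    """Count class and per-attribute value frequencies on MS.

    Each row is (attribute_tuple, class_label); attributes are categorical.
    """
    class_counts = Counter(label for _, label in ms)
    value_counts = defaultdict(Counter)  # (attr_index, label) -> value counts
    domains = defaultdict(set)           # attr_index -> values seen in MS
    for attrs, label in ms:
        for i, value in enumerate(attrs):
            value_counts[(i, label)][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains


def predict(model, attrs):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)  # log prior
        for i, value in enumerate(attrs):
            count = value_counts[(i, label)][value]
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((count + 1) / (n_label + len(domains[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, ps):
    """Fraction of rows in PS whose predicted class matches the true class."""
    correct = sum(1 for attrs, label in ps if predict(model, attrs) == label)
    return correct / len(ps)
```

      A typical use would be `ms, ps = split_ms_ps(rows)`, then 
        `model = train_naive_bayes(ms)` and `accuracy(model, ps)`. Working with 
        log probabilities rather than raw products keeps the scores from 
        underflowing when there are many attributes.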
 
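      For the ensemble task, bagging and weighted voting can be sketched as 
        follows. This is a minimal Python illustration under stated assumptions: 
        each classifier has already produced one predicted label for the same 
        record, and the weights are supplied by the caller. The helper names 
        `bootstrap_sample` and `weighted_vote` are hypothetical.

```python
import random
from collections import defaultdict


def bootstrap_sample(ms, rng):
    """Draw len(ms) rows from MS with replacement (one bagging replicate)."""
    return [rng.choice(ms) for _ in ms]


def weighted_vote(predictions, weights=None):
    """Combine one predicted label per classifier into a single decision.

    With weights=None every classifier gets an equal vote.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per classifier")
    tally = defaultdict(float)
    for label, weight in zip(predictions, weights):
        tally[label] += weight
    # Ties go to the label that appeared first among the predictions.
    return max(tally, key=tally.get)
```

      One plausible source of weights is each classifier's accuracy on a 
        held-out part of MS. Deriving the weights from PS would bias the 
        accuracy estimate that PS is meant to provide.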