Exercise: Analyse the data sets using MS Analysis and
Clementine. For example:
Construct a decision-tree model for the cc2000 training data, and test
the model on the cc2000 evaluation data. Let CARAVAN be the target
(output) variable.
Construct a neural-net model for the cc2000 training data, and test
the model on the cc2000 evaluation data. Again, let CARAVAN be the target
(output) variable.
Compare the performance of the decision-tree and the neural-net
models.
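One way to make the comparison concrete is to score both models against the same evaluation labels. The sketch below is plain Python, not part of MS Analysis or Clementine. It assumes the predictions have been exported as lists of 0/1 values; the function names and the sample vectors are made up for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def accuracy(counts):
    """Fraction of all cases that were classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


def recall(counts):
    """Fraction of the actual positives that the model found."""
    denom = counts["tp"] + counts["fn"]
    return counts["tp"] / denom if denom else 0.0


# Made-up labels and predictions, for illustration only.
actual = [0, 0, 0, 1, 1, 0]
tree_pred = [0, 0, 0, 0, 1, 0]
net_pred = [0, 1, 0, 1, 1, 0]

tree_counts = confusion_counts(actual, tree_pred)
net_counts = confusion_counts(actual, net_pred)
print("tree:", accuracy(tree_counts), recall(tree_counts))
print("net: ", accuracy(net_counts), recall(net_counts))
```

In this toy example the two models have the same accuracy but different recall, so it may be worth reporting more than one measure, particularly if one value of the target is rare.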
Repeat the steps above, but using only a subset of the
variables (e.g. only MOSTYPE and CARAVAN). Which subset gives the best performance?
Construct a decision-tree model for the baskets training data, where
the variables value, income, and age are discretized into categories
(10-20, 20-30, 30-40, 40-50), (10K-15K, 15K-20K, 20K-25K, 25K-30K,
30K-35K), and (15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50,
50-55), respectively, and variable cardid is filtered out. Test the
model on the evaluation data. Let sex be the target
(output) variable.
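The discretization and filtering step can be sketched outside the tools as follows. This is a minimal Python illustration, not the Clementine or MS Analysis procedure. It makes three assumptions that the exercise text does not settle: each record is a dict, income is stored in plain units (e.g. 12500 rather than 12.5), and the intervals are half-open, so a boundary value such as 20 falls in the upper bin 20-30. The helper names `discretize` and `prepare` are hypothetical.

```python
from bisect import bisect_right

# Bin edges taken from the exercise text.
EDGES = {
    "value": [10, 20, 30, 40, 50],
    "income": [10_000, 15_000, 20_000, 25_000, 30_000, 35_000],
    "age": [15, 20, 25, 30, 35, 40, 45, 50, 55],
}


def discretize(variable, x):
    """Return a label such as '20-30' for x, or None if x is outside all bins.

    Assumption: bins are half-open, [low, high).
    """
    edges = EDGES[variable]
    i = bisect_right(edges, x) - 1
    if i < 0 or i >= len(edges) - 1:
        return None
    return f"{edges[i]}-{edges[i + 1]}"


def prepare(record):
    """Drop cardid and replace value, income and age by their bin labels."""
    out = {k: v for k, v in record.items() if k != "cardid"}
    for name in EDGES:
        out[name] = discretize(name, out[name])
    return out


# Made-up record, for illustration only.
row = {"cardid": 1, "value": 25.5, "income": 12500, "age": 33, "sex": "F"}
print(prepare(row))
```

Values outside the listed ranges are mapped to None here; whether to drop such records or give them a category of their own is a choice the exercise leaves open.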
Construct a neural-net model for the baskets data without the
discretizations above, but still filter out the cardid
variable. Test the model on the evaluation data. Again, let sex be the target
(output) variable.
Compare the performance of the decision-tree and the neural-net models.
What proportions of men and of women buy beer? Canned
food? Wine? Frozen meals? Confectionery?
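As a rough cross-check of the figures obtained in the tools, the proportions can be computed directly. The Python sketch below reads the question as "what fraction of the men, and what fraction of the women, buy the product". It assumes each record is a dict with a sex field and one true/false field per product. The field names and the sample rows are hypothetical, not taken from the baskets data.

```python
from collections import Counter


def purchase_rates(records, product):
    """For each sex, return the fraction of customers who bought `product`."""
    totals = Counter()
    buyers = Counter()
    for r in records:
        totals[r["sex"]] += 1
        if r[product]:
            buyers[r["sex"]] += 1
    return {sex: buyers[sex] / totals[sex] for sex in totals}


# Made-up rows, for illustration only.
sample = [
    {"sex": "M", "beer": True, "wine": False},
    {"sex": "M", "beer": True, "wine": True},
    {"sex": "F", "beer": False, "wine": True},
    {"sex": "F", "beer": True, "wine": True},
]

for product in ("beer", "wine"):
    print(product, purchase_rates(sample, product))
```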
Generate the rules that determine whether the customer is a woman or a man.
Carry out further investigations of a similar kind.
Carry out similar investigations of the ContraceptiveMethods data.