You are performing exploratory analysis of files that are encoded in a complex proprietary format. The format
requires disk-intensive access to several dependent files in HDFS.
You need to build an Azure Machine Learning model by using a canopy clustering algorithm. You must ensure
that changes to proprietary file formats can be maintained by using the least amount of effort.
Which Machine Learning library should you use?
A. MicrosoftML
B. scikit-learn
C. SparkR
D. Mahout