Go to the folder where you want to store your project, and clone the new repository:
The deployable jar for SAMOA will be in target/SAMOA-Storm-0.0.1-SNAPSHOT.jar.
If you want to compile SAMOA for S4, you will need to install the S4 dependencies manually as explained in Executing SAMOA with Apache S4.
The deployable jar for SAMOA will be in target/SAMOA-S4-0.0.1-SNAPSHOT.jar.
If you want to test SAMOA in a local environment, simply clone the repository and install SAMOA.
The deployable jar for SAMOA will be in target/SAMOA-Local-0.0.1-SNAPSHOT.jar.
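Before running anything, it can help to confirm which of the jars named above were actually produced. The following is a minimal sketch, not part of SAMOA: the function name list_samoa_jars is hypothetical, and it assumes the build output sits in a target directory relative to where you run it.

```shell
# Hypothetical helper (not shipped with SAMOA): report which of the three
# jars mentioned above exist in a build directory (defaults to ./target).
list_samoa_jars() {
  dir="${1:-target}"
  for platform in Storm S4 Local; do
    jar="$dir/SAMOA-$platform-0.0.1-SNAPSHOT.jar"
    if [ -f "$jar" ]; then
      echo "found: $jar"
    else
      echo "missing: $jar"
    fi
  done
}

# Check the default build directory.
list_samoa_jars target
```

A jar reported as missing usually just means SAMOA has not been built for that platform yet.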
~$ wget "https://downloads.sourceforge.net/project/moa-datastream/Datasets/Classification/covtypeNorm.arff.zip"
~$ unzip covtypeNorm.arff.zip
Forest Covertype contains the forest cover type for 30 x 30 meter cells obtained from the US Forest Service (USFS) Region 2 Resource Information System (RIS) data. It contains 581,012 instances and 54 attributes, and it has been used in several articles on data stream classification.
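To sanity-check the download against the figures above, you can count the attribute declarations and data rows in the ARFF file. This is a rough sketch under a few assumptions: the function name arff_summary is hypothetical, the file follows the usual ARFF layout (an @attribute header section followed by @data), and each instance occupies one line. The attribute count includes the class attribute if the header declares one, so it may be one higher than the number of input attributes.

```shell
# Hypothetical helper: count @attribute declarations and data rows in an ARFF file.
arff_summary() {
  awk '
    BEGIN { in_data = 0; attrs = 0; rows = 0 }
    { sub(/\r$/, "") }                                   # tolerate Windows line endings
    in_data == 0 && tolower($1) == "@attribute" { attrs++; next }
    in_data == 0 && tolower($1) == "@data"      { in_data = 1; next }
    # After @data: count non-empty lines that are not % comments.
    in_data == 1 && NF > 0 && $0 !~ /^[ \t]*%/  { rows++ }
    END { printf "attributes=%d instances=%d\n", attrs, rows }
  ' "$1"
}

# Only run against the dataset if it has been downloaded and unzipped.
if [ -f covtypeNorm.arff ]; then
  arff_summary covtypeNorm.arff
fi
```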
Classifying the CoverType dataset with the bagging algorithm
~$ bin/samoa local target/SAMOA-Local-0.0.1-SNAPSHOT.jar "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (ArffFileStream -f covtypeNorm.arff) -f 100000"
The output will be a list of the evaluation results, reported every 100,000 instances.