Weka is a collection of machine learning tools used for data mining. Weka is written in Java; however, it is possible to use Weka's libraries inside Ruby. To do this, we must install Java and Rjb, and of course obtain Weka itself (the code below loads it from weka.jar). In this example, I look at association mining.
The following is an example of a frequent itemset finder.
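require "rjb"
#Load Java Jar
dir = "./weka.jar"
#Have Rjb load the jar file, and pass Java command line arguments
#(the heap size here is an example value; adjust it to your machine)
Rjb::load(dir, ["-Xmx512m"])
#make initial association Miner
obj = Rjb::import("weka.associations.Apriori")
assc = obj.new
#load the data using Java and Weka
weather_src = Rjb::import("java.io.FileReader").new("weather.nominal.arff")
weather_data = Rjb::import("weka.core.Instances").new(weather_src)
#Find the frequent itemsets
assc.setCar(true) #mines for class association
assc.setLowerBoundMinSupport(0.25) #set minimum support
assc.buildAssociations(weather_data) #run Apriori on the loaded data
puts assc.toString #print a summary of the results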
We first tell Rjb to load the specified classpath, which for us is our jar file. I also passed JVM command line arguments that specify the amount of RAM to use.
Rjb::import loads a specific class by its fully qualified name. The class is looked up on the classpath we just loaded.
I call the constructor of an imported class with Ruby's .new method. Afterward, I can use the new object like any other Ruby object. Method names are the same as in the Java source files. For an explanation of data type conversions, click here.
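To make the minimum support value passed to setLowerBoundMinSupport more concrete, here is a rough pure-Ruby sketch of what "support" means for an itemset. It is not Weka's implementation and does not use Rjb. The rows, the `support` helper, and the `frequent_itemsets` helper are all made up for illustration.

```ruby
# Toy transactions: each row is a list of attribute=value items.
# These rows are invented for illustration; they are not weather.nominal.arff.
ROWS = [
  %w[outlook=sunny windy=false play=no],
  %w[outlook=sunny windy=true play=no],
  %w[outlook=overcast windy=false play=yes],
  %w[outlook=rainy windy=false play=yes]
].freeze

# Fraction of rows that contain every item in the itemset.
def support(itemset, rows)
  rows.count { |row| (itemset - row).empty? }.fdiv(rows.size)
end

# All itemsets of the given size whose support is at least min_support.
# This brute-force version checks every combination; Apriori prunes
# candidates instead, but the threshold plays the same role.
def frequent_itemsets(rows, size, min_support)
  rows.flatten.uniq.sort.combination(size)
      .select { |set| support(set, rows) >= min_support }
end

p support(%w[outlook=sunny play=no], ROWS)
p frequent_itemsets(ROWS, 2, 0.5)
```

With a threshold of 0.5, only pairs that appear in at least two of the four rows survive. Lowering the threshold lets more itemsets through, which is the knob the 0.25 in the Weka example is turning.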
The dataset I used is different from the previous one because Apriori can't use numerical values; it needs nominal attributes. The dataset is found inside the data folder downloaded with weka.jar.
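If you are not sure whether a dataset is usable here, a quick check of the ARFF header can help. The sketch below is a plain Ruby helper (hypothetical, not part of Weka or Rjb) that lists attributes declared as numeric, real, or integer. It assumes simple unquoted attribute names, and the sample header is invented for illustration.

```ruby
# Return the names of attributes declared with a numeric type in ARFF text.
# Assumes unquoted attribute names without spaces.
def numeric_attributes(arff_text)
  arff_text.each_line.map do |line|
    # Declarations look like: @attribute <name> <type>
    match = line.strip.match(/\A@attribute\s+(\S+)\s+(numeric|real|integer)\b/i)
    match && match[1]
  end.compact
end

# A made-up header mixing nominal and numeric attributes.
SAMPLE_HEADER = <<~ARFF
  @relation weather
  @attribute outlook {sunny, overcast, rainy}
  @attribute temperature numeric
  @attribute humidity real
  @attribute play {yes, no}
  @data
ARFF

p numeric_attributes(SAMPLE_HEADER)
```

An empty result suggests every attribute is nominal (or a string or date type, which this sketch does not flag).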
As I said before, this can also be done in JRuby. A great example can be found in this blog post, which inspired my post.