Weka, Ruby, Association Mining
by Tyler on May.15, 2010, under Data Mining, Ruby, Weka
Weka is a collection of machine learning tools used for data mining. Weka is written in Java, but its libraries can be used from Ruby. To do this, we must install Java and Rjb, and of course obtain Weka itself (here, weka.jar). In this example, I look at association mining.
Refer to my previous example for setup instructions.
Running Weka
The following is an example of a frequent itemset finder.
require 'rjb'
#----------------------
def asscm()
  # Path to the Weka jar, which serves as the Java classpath
  dir = "./weka.jar"
  # Have Rjb load the jar file, and pass JVM command line arguments
  Rjb::load(dir, ["-Xmx1000M"])
  # Make the initial association miner
  obj = Rjb::import("weka.associations.Apriori")
  assc = obj.new
  # Load the data using Java and Weka
  weather_src = Rjb::import("java.io.FileReader").new("weather.nominal.arff")
  weather_data = Rjb::import("weka.core.Instances").new(weather_src)
  # Find the frequent itemsets
  assc.setCar(true) # mine class association rules
  assc.setLowerBoundMinSupport(0.25) # set the minimum support
  assc.buildAssociations(weather_data)
  puts assc.toString
end
#----------------------
asscm()
We first tell Rjb to load the specified classpath, which for us is our jar file. I also passed JVM command line arguments that cap the amount of RAM to use (here, a 1000 MB maximum heap).
Rjb::import loads specific classes. These are relative to our classpath.
I call the constructor for the new classes by using the .new method from Ruby. Afterward, I can use the new object like any other Ruby object. The method names are as they are found in their Java source files. For an explanation of data type conversions, click here.
The dataset I used is different from the one in the previous example because Apriori can't use numerical values; it needs nominal attributes. The dataset is found inside the data folder downloaded with weka.jar.
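To make the minimum support setting from the example concrete, here is a minimal pure-Ruby sketch of what "support" means for an itemset. It is a brute-force count, not Weka's Apriori implementation, and the transactions, the `support` and `frequent_itemsets` helpers, and the `max_size` limit are all invented for illustration.

```ruby
require 'set'

# Toy transactions standing in for rows of a nominal dataset.
# These values are made up; they are not read from weather.nominal.arff.
TRANSACTIONS = [
  %w[sunny hot no],
  %w[sunny hot no],
  %w[overcast hot yes],
  %w[rainy mild yes]
].map(&:to_set)

# Support = fraction of transactions that contain every item in the itemset.
def support(itemset, transactions)
  transactions.count { |t| itemset.subset?(t) }.fdiv(transactions.size)
end

# Brute force: try every combination of up to max_size items and keep
# those whose support reaches min_support. Real Apriori prunes candidates
# instead of enumerating all of them.
def frequent_itemsets(transactions, min_support, max_size = 2)
  items = transactions.reduce(Set.new, :|).to_a.sort
  (1..max_size).flat_map do |k|
    items.combination(k)
         .map(&:to_set)
         .select { |s| support(s, transactions) >= min_support }
  end
end

frequent_itemsets(TRANSACTIONS, 0.5).each do |s|
  puts "#{s.to_a.join(', ')} (support #{support(s, TRANSACTIONS)})"
end
```

Lowering the threshold admits rarer itemsets, which is the role setLowerBoundMinSupport(0.25) plays in the Weka call above.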
As I said before, this can also be done in JRuby. A good example can be found in this blog post, which inspired mine.

July 23rd, 2012 on 11:48 am
Great post, thank you. I am trying to integrate WEKA with my Rails app. I keep getting a Java error:
ERROR retrieving feed: java.lang.NoClassDefFoundError: weka/clusterers/SimpleKMeans
Where do you put the weka.jar file?
Thank you,
Ksenia
July 23rd, 2012 on 11:55 am
In one of my old class projects, I created a subfolder called “tools” in the rails root directory. I placed the Weka files there.
dir = "./tools/"
Rjb::load(dir, jvmargs=["-Xmx1000M"])
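A NoClassDefFoundError like the one above usually means the class was not on the classpath Rjb was given, so one way to rule out a path problem is to resolve the jar location explicitly and fail early if the file is missing. This is only a sketch: the `weka_jar_path` helper and the `tools/weka.jar` layout are assumptions, not something Rjb or Rails provides.

```ruby
# Hypothetical helper: resolve the jar path against an application root and
# raise a readable error if the file is not there.
def weka_jar_path(app_root, relative = 'tools/weka.jar')
  path = File.expand_path(relative, app_root)
  raise ArgumentError, "weka.jar not found at #{path}" unless File.file?(path)
  path
end

# Usage inside a Rails app (assumes the rjb gem is installed and the jar
# really lives at tools/weka.jar under the app root):
#   require 'rjb'
#   Rjb::load(weka_jar_path(Rails.root.to_s), ['-Xmx1000M'])
```

An absolute path also avoids surprises when the app is started from a working directory other than the Rails root.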
August 1st, 2012 on 11:10 am
Another question. Have you done anything with weka.core.Attribute? I am trying to do class evaluation and keep getting the error java.lang.IllegalArgumentException: Illegal pattern character 'e'
Here is my code:
fastVector = Rjb::import('weka.core.FastVector')
attribute = Rjb::import('weka.core.Attribute')
inputClassVector = fastVector.new(2)
inputClassVector.addElement("english")
inputClassVector.addElement("none_english")
wekaClassAttribute = attribute.new("class", inputClassVector)
It seems like the fastVector may not be correct; I get the error while trying to create the new attribute.
Thank you,
Ksenia
August 2nd, 2012 on 10:05 am
I think it’s parsing your second parameter as a string with a date format. Attribute has a few constructors and one of them is for date attributes. I’m not sure why it’s confusing the two lol. Try this:
wekaClassAttribute = attribute.new_with_sig('Ljava.lang.String;Lweka.core.FastVector;', "class", inputClassVector)
new_with_sig calls new and specifies the signature. The first parameter specifies the exact signature we want, and the remaining are the arguments.
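The signature string is easy to mistype, so here is a small sketch that builds it from fully qualified class names. The `object_signature` helper is hypothetical (it is not part of Rjb) and only covers object types, not primitives or arrays.

```ruby
# Hypothetical helper: wrap each fully qualified Java class name in the
# "L<name>;" form used in the signature string above, then concatenate.
def object_signature(*class_names)
  class_names.map { |name| "L#{name};" }.join
end

sig = object_signature('java.lang.String', 'weka.core.FastVector')
puts sig

# Usage with the classes from this thread (assumes Rjb and weka.jar are
# already loaded, and attribute / inputClassVector are defined as above):
#   wekaClassAttribute = attribute.new_with_sig(sig, "class", inputClassVector)
```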