Saturday, May 25, 2013

How to generate a confusion matrix visualization in Python and how to use it with scikit-learn

Confusion matrices are quite useful for understanding where your classifier goes wrong. scikit-learn lets you retrieve the confusion matrix easily (metrics.confusion_matrix(y_true, y_pred)), but the raw array is hard to read.
An image representation, like this one, is a much better way to look at it.
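
For readers who want to draw such an image directly, here is a minimal sketch using matplotlib. It is not the code behind the picture in this post; the function name plot_confusion_matrix, the file name, and the color map are all assumptions chosen for illustration. It expects rows to be true labels and columns to be predicted labels, which is the layout metrics.confusion_matrix returns.

```python
import matplotlib
matplotlib.use('Agg')  # render to a file; no display required
import matplotlib.pyplot as plt


def plot_confusion_matrix(cm, labels, filename='conf_matrix.png'):
    """Save a heat map of cm (rows = true labels, columns = predicted)."""
    fig, ax = plt.subplots()
    im = ax.imshow(cm, interpolation='nearest', cmap='Blues')
    fig.colorbar(im, ax=ax)
    # one tick per class, labelled with the class name
    ax.set_xticks(range(len(labels)))
    ax.set_yticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_yticklabels(labels)
    ax.set_xlabel('Predicted label')
    ax.set_ylabel('True label')
    # write the raw count inside each cell
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            ax.text(j, i, str(count), ha='center', va='center')
    fig.savefig(filename)
    plt.close(fig)
    return filename
```

Calling plot_confusion_matrix([[5, 1], [2, 7]], ['cat', 'dog']) writes conf_matrix.png in the current directory.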


From a confusion matrix, you can derive the classification error, precision and recall, and extract the confusion highlights. mlboost now has a simple util class, ConfMatrix, that does all of this. Here is an example:

 from sklearn import metrics
 from mlboost.util.confusion_matrix import ConfMatrix
 # clf is any scikit-learn classifier
 clf.fit(X_train, y_train)
 pred = clf.predict(X_train)
 # sorted unique labels, matching the row/column order of confusion_matrix
 labels = sorted(set(y_train))
 cm = ConfMatrix(metrics.confusion_matrix(y_train, pred), labels)
 cm.save_matrix('conf_matrix.p')
 cm.get_classification()
 cm.gen_conf_matrix('conf_matrix')
 cm.gen_highlights('conf_matrix_highlights')
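
To make the derived quantities concrete, here is a rough sketch of how they can be computed by hand from a plain list-of-lists matrix. This is not how ConfMatrix is implemented; summarize is a hypothetical helper, and I am assuming "confusion highlights" means the largest off-diagonal cells. As in scikit-learn, cm[i][j] counts samples of true class i predicted as class j.

```python
def summarize(cm, labels, top=3):
    """Return (error, precision, recall, confusions) for a confusion matrix.

    cm[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    error = 1.0 - correct / float(total)

    precision, recall = {}, {}
    for i, label in enumerate(labels):
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum
        precision[label] = cm[i][i] / float(predicted) if predicted else 0.0
        recall[label] = cm[i][i] / float(actual) if actual else 0.0

    # largest non-zero off-diagonal cells: (count, true label, predicted label)
    confusions = sorted(
        ((cm[i][j], labels[i], labels[j])
         for i in range(n) for j in range(n)
         if i != j and cm[i][j]),
        reverse=True)[:top]
    return error, precision, recall, confusions


error, precision, recall, confusions = summarize(
    [[5, 1, 0], [2, 3, 1], [0, 0, 8]], ['a', 'b', 'c'])
```

In this toy matrix, the biggest confusion is class 'b' being predicted as 'a' (2 samples), which is the kind of pair worth inspecting first.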

Thursday, May 16, 2013

The simplest python server example ;)

Today I was looking for a simple Python server template but couldn't find a good one, so here is what I was looking for (yes, the title is a little arrogant ;):

#!/usr/bin/env python
''' simple python server example; 
    output format supported = html, raw or json '''
import sys
import json
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

FORMATS = ('html','json','raw')
format = FORMATS[0]

class Handler(BaseHTTPRequestHandler):

    #handle GET command
    def do_GET(self):
        if format == 'html':
            self.send_response(200)
            self.send_header('Content-type', 'text/html')
            self.end_headers()
            self.wfile.write("body")
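        elif format == 'json':
            # send a proper HTTP status line and headers before the body
            self.send_response(200)
            self.send_header('Content-type', 'application/json')
            self.end_headers()
            self.wfile.write(json.dumps({'path':self.path}))
        else:
            # raw: tab-separated plain text
            self.send_response(200)
            self.send_header('Content-type', 'text/plain')
            self.end_headers()
            self.wfile.write("%s\t%s" %('path', self.path))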
        return

def run(port=8000):

    print('http server is starting...')
    #ip and port of server
    server_address = ('127.0.0.1', port)
    httpd = HTTPServer(server_address, Handler)
    print('http server is running...listening on port %s' %port)
    httpd.serve_forever()

if __name__ == '__main__':
    from optparse import OptionParser
    op = OptionParser(__doc__)

    op.add_option("-p", default=8000, type="int", dest="port",
                  help="port #")
    op.add_option("-f", default='json', dest="format",
                  help="format available %s" %str(FORMATS))
    op.add_option("--no_filter", default=True, action='store_false',
                  dest="filter", help="don't filter")

    opts, args = op.parse_args(sys.argv)

    format = opts.format
    run(opts.port)
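
The script above targets Python 2. On Python 3, BaseHTTPServer became http.server and response bodies have to be bytes, so here is a rough sketch of the same idea for Python 3. The helpers render and make_handler are names I made up for this sketch; binding the format through a closure instead of a module-level global is just one possible design.

```python
import html
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

FORMATS = ('html', 'json', 'raw')


def render(fmt, path):
    """Return (content_type, body_bytes) describing the requested path."""
    if fmt == 'html':
        # escape the path so it cannot inject markup into the page
        ctype = 'text/html'
        text = '<html><body>%s</body></html>' % html.escape(path)
    elif fmt == 'json':
        ctype, text = 'application/json', json.dumps({'path': path})
    else:  # 'raw': tab-separated plain text
        ctype, text = 'text/plain', 'path\t%s' % path
    return ctype, text.encode('utf-8')


def make_handler(fmt):
    """Build a handler class bound to one output format."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            ctype, body = render(fmt, self.path)
            self.send_response(200)
            self.send_header('Content-Type', ctype)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return Handler


def run(port=8000, fmt='json'):
    httpd = HTTPServer(('127.0.0.1', port), make_handler(fmt))
    print('http server is running...listening on port %s' % httpd.server_port)
    httpd.serve_forever()
```

Calling run(8000, 'json') starts the server; a GET on http://127.0.0.1:8000/hello should then answer with a JSON object containing the path.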