adam bien's blog

Enhydrator With Nashorn Sink Released

Enhydrator v0.6.2, the Apache-licensed Java 8 data transformation utility, is available from Maven Central:


<dependency>
    <groupId>com.airhacks</groupId>
    <artifactId>enhydrator</artifactId>
    <version>0.6.2</version>
</dependency>

You can now also implement Enhydrator's output, the Sink, entirely in JavaScript / Nashorn (see ScriptableSink). A Sink implemented in JavaScript is particularly convenient for generating JSON documents.
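To illustrate the idea of a script-backed sink, here is a minimal sketch of how a Java host could drive the three functions (init, processRow, close) of a JavaScript file through the standard javax.script API. This is not Enhydrator's actual ScriptableSink implementation; the class SinkLifecycleSketch and its stand-in script are hypothetical. It assumes a Nashorn engine is available (bundled with JDK 8 to 14, or added as a separate library on newer JDKs) and returns null otherwise.

```java
import java.util.Arrays;
import java.util.List;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class SinkLifecycleSketch {

    // Hypothetical stand-in script exposing the same three functions as a sink script.
    static final String SCRIPT =
            "var rows = [];\n"
            + "function init() { rows = []; }\n"
            + "function processRow(row) { rows.push(String(row)); }\n"
            + "function close() { return JSON.stringify(rows); }\n";

    // Returns the JSON produced by close(), or null when no Nashorn engine is installed.
    static String run(List<String> input) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            return null;
        }
        try {
            engine.eval(SCRIPT);
            Invocable script = (Invocable) engine;
            script.invokeFunction("init");
            for (String row : input) {
                // One call per row, analogous to one call per database record
                script.invokeFunction("processRow", row);
            }
            return String.valueOf(script.invokeFunction("close"));
        } catch (Exception e) {
            throw new IllegalStateException("Script invocation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
    }
}
```

The host only needs to know the function names; everything else, including the output format, stays in the script.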

In the example below, I extract the titles and links from this blog to create a JSON document as the first stage of generating archive.adam-bien.com:


    public static Memory extractData(String dbUri, String fileName) {
        // Select the published posts, newest first
        String query = "SELECT w.TITLE, w.ANCHOR, w.TEXT, w.UPDATETIME FROM \"PUBLIC\".WEBLOGENTRY w "
                + "WHERE w.STATUS = 'PUBLISHED' ORDER BY w.UPDATETIME DESC";
        JDBCSource source = new JDBCSource.Configuration().
                driver("org.hsqldb.jdbcDriver").
                user("...").
                password("...").
                url("jdbc:hsqldb:hsql://" + dbUri + "/blog").
                newSource();

        Pump pump = new Pump.Engine().
                from(source).
                sqlQuery(query).
                with("UPDATETIME", (input) -> {
                    Timestamp t = (Timestamp) input;
                    return t.toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE);
                }).
                to(new ScriptableSink("*", fileName)).
                build();
        return pump.start();
    }
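The UPDATETIME transformation in the pipeline above can be tried in isolation. The sketch below repeats the same conversion from java.sql.Timestamp to an ISO date string using only the JDK; the class name UpdateTimeFormat and the sample timestamp are made up for illustration.

```java
import java.sql.Timestamp;
import java.time.format.DateTimeFormatter;
import java.util.function.Function;

public class UpdateTimeFormat {

    // Same conversion as the lambda passed to with("UPDATETIME", ...)
    static final Function<Object, Object> TO_ISO_DATE = input -> {
        Timestamp t = (Timestamp) input;
        return t.toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE);
    };

    public static void main(String[] args) {
        // Sample value standing in for a column read from the database
        Object raw = Timestamp.valueOf("2015-10-31 08:15:00");
        System.out.println(TO_ISO_DATE.apply(raw)); // prints 2015-10-31
    }
}
```

The time of day is dropped, so the script receives a plain date string for each row.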

The ScriptableSink reads the following script from the external file:


var File = Packages.java.nio.file;
var Stream = Packages.java.util.stream;


function writeFile(file, content) {
    print("Writing to: " + file);
    var path = File.Paths.get(file);
    File.Files.write(path, String(content).bytes);
}

var context = function () {
    var count = 0;
    return {
        uri: "http://www.adam-bien.com/roller/abien/entry/",
        increment: function () {
            count++;
        },
        counterValue: function () {
            return count;
        },
        entries: []
    };
}();




function init() {
    print("init");
}


function processRow(entries) {
    var cols = entries.getColumnValues();
    var post = {
        pub: String(cols.UPDATETIME.get()),
        title: cols.TITLE.get(),
        link: context.uri + cols.ANCHOR.get()
    };
    context.entries.push(post);
    context.increment();

}


function close() {
    print("Processed: ", context.counterValue());
    var blog = {
        generatedAt: new Date().toLocaleDateString(),
        entries: context.entries
    };
    var serialized = JSON.stringify(blog);
    writeFile("./input/index.json", serialized);
}

...and generates the following JSON object stored in index.json:


{"generatedAt":"2015-11-06","entries":[
{"pub":"2015-10-31","title":"Blog Archive Is Available","link":"http://www.adam-bien.com/roller/abien/entry/blog_archive_is_available"},(...)
}]}
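The shape of that document is simple: one top-level object with a generatedAt date and an entries array, where each entry carries pub, title and link. As a rough sketch, the same structure can be assembled in plain Java. The class IndexJsonShape and its helpers are hypothetical and only escape backslashes and double quotes; a real JSON library would be the safer choice.

```java
import java.util.Arrays;
import java.util.List;

public class IndexJsonShape {

    // Minimal escaping for this sketch: backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // One element of the "entries" array.
    static String entry(String pub, String title, String link) {
        return "{\"pub\":" + quote(pub)
                + ",\"title\":" + quote(title)
                + ",\"link\":" + quote(link) + "}";
    }

    // The top-level object: a generation date plus the list of entries.
    static String blog(String generatedAt, List<String> entries) {
        return "{\"generatedAt\":" + quote(generatedAt)
                + ",\"entries\":[" + String.join(",", entries) + "]}";
    }

    public static void main(String[] args) {
        String post = entry("2015-10-31", "Blog Archive Is Available",
                "http://www.adam-bien.com/roller/abien/entry/blog_archive_is_available");
        System.out.println(blog("2015-11-06", Arrays.asList(post)));
    }
}
```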

The index.json is merged with the index.htm template by github.com/AdamBien/spg into the statically generated page archive.adam-bien.com. The whole archive generation process (extracting more than 1400 articles from the DB and launching Enhydrator and SPG in two JVMs) takes 5 seconds in total.

The extraction is not specific to my blog and should work with all Apache Roller (rollerweblogger) blogs. A slightly modified pipeline could be used for migrations, backups or indexing.
