Enhydrator With Nashorn Sink Released
Enhydrator v0.6.2, the Apache-licensed Java 8 data transformation utility, is available from Maven Central:
<dependency>
<groupId>com.airhacks</groupId>
<artifactId>enhydrator</artifactId>
<version>0.6.2</version>
</dependency>
You can now also implement Enhydrator's output, the Sink, entirely in JavaScript on Nashorn (see ScriptableSink). A Sink implemented in JavaScript is particularly convenient for generating JSON documents.
In the example below, I'm extracting the titles and links from this blog to create a JSON document as the first stage in generating archive.adam-bien.com:
public static Memory extractData(String dbUri, String fileName) {
String query = "SELECT w.TITLE, w.ANCHOR, w.TEXT, w.UPDATETIME FROM \"PUBLIC\".WEBLOGENTRY w "
+ "WHERE w.STATUS = 'PUBLISHED' ORDER BY w.UPDATETIME DESC";
JDBCSource source = new JDBCSource.Configuration().
driver("org.hsqldb.jdbcDriver").
user("...").
password("...").
url("jdbc:hsqldb:hsql://" + dbUri + "/blog").
newSource();
Pump pump = new Pump.Engine().
from(source).
sqlQuery(query).
with("UPDATETIME", (input) -> {
Timestamp t = (Timestamp) input;
return t.toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE);
}).
to(new ScriptableSink("*", fileName)).
build();
return pump.start();
}
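The only transformation in this pipeline is the UPDATETIME conversion. As a minimal sketch, the same lambda body can be pulled out into a standalone method and tried without a database. The class name UpdateTimeFormat and the sample timestamp are made up for illustration, and the cast assumes the JDBC driver delivers the column as a java.sql.Timestamp, as the code above does:

```java
import java.sql.Timestamp;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Enhydrator.
class UpdateTimeFormat {

    // Same body as the lambda passed to with("UPDATETIME", ...):
    // the column value arrives as an Object and is assumed to be a Timestamp.
    static String toIsoDate(Object input) {
        Timestamp t = (Timestamp) input;
        return t.toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        // Sample value chosen for illustration only.
        Timestamp published = Timestamp.valueOf("2015-10-31 09:15:00");
        System.out.println(toIsoDate(published)); // prints 2015-10-31
    }
}
```

The time of day is dropped on purpose: the archive only needs the publication date, so the script later receives a plain yyyy-MM-dd string.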
The ScriptableSink reads the following script from the external file:
var File = Packages.java.nio.file;
var Stream = Packages.java.util.stream;
function writeFile(file, content) {
print("Writing to: " + file);
var path = File.Paths.get(file);
File.Files.write(path, String(content).bytes);
}
var context = function () {
var count = 0;
return {
uri: "http://www.adam-bien.com/roller/abien/entry/",
increment: function () {
count++;
},
counterValue: function () {
return count;
},
entries: []
};
}();
function init() {
print("init");
}
function processRow(entries) {
var cols = entries.getColumnValues();
var post = {
pub: String(cols.UPDATETIME.get()),
title: cols.TITLE.get(),
link: context.uri + cols.ANCHOR.get()
};
context.entries.push(post);
context.increment();
}
function close() {
print("Processed: ", context.counterValue());
var blog = {
generatedAt: new Date().toLocaleDateString(),
entries: context.entries
};
var serialized = JSON.stringify(blog);
writeFile("./input/index.json", serialized);
}
...and generates the following JSON object stored in index.json:
{"generatedAt":"2015-11-06","entries":[
{"pub":"2015-10-31","title":"Blog Archive Is Available","link":"http://www.adam-bien.com/roller/abien/entry/blog_archive_is_available"},(...)
}]}
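For readers more at home in Java than in Nashorn, the script's pattern (collect one small record per row, serialize everything on close) can be restated as a rough plain-Java sketch. The class ArchiveIndex and its methods are hypothetical and are not Enhydrator API. The hand-rolled JSON quoting is an assumption that only covers quotes and backslashes, so a real JSON library would be the safer choice:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical plain-Java counterpart of the script; not part of Enhydrator.
class ArchiveIndex {

    static final String URI = "http://www.adam-bien.com/roller/abien/entry/";

    private final List<Map<String, String>> entries = new ArrayList<>();

    // Counterpart of processRow: keep pub, title and link for each post.
    void processRow(String updateTime, String title, String anchor) {
        Map<String, String> post = new LinkedHashMap<>(); // preserves key order
        post.put("pub", updateTime);
        post.put("title", title);
        post.put("link", URI + anchor);
        entries.add(post);
    }

    // Counterpart of counterValue: number of rows seen so far.
    int count() {
        return entries.size();
    }

    // Counterpart of close: serialize everything collected so far.
    String toJson(String generatedAt) {
        String posts = entries.stream()
                .map(ArchiveIndex::object)
                .collect(Collectors.joining(","));
        return "{\"generatedAt\":" + quote(generatedAt) + ",\"entries\":[" + posts + "]}";
    }

    private static String object(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    // Minimal escaping: backslash and double quote only (control characters are not handled).
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        ArchiveIndex index = new ArchiveIndex();
        index.processRow("2015-10-31", "Blog Archive Is Available", "blog_archive_is_available");
        System.out.println("Processed: " + index.count());
        System.out.println(index.toJson("2015-11-06"));
    }
}
```

Compared with this, the JavaScript version gets the serialization for free from JSON.stringify, which is the main reason a scripted Sink is convenient here.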
The index.json is merged with the index.htm template by github.com/AdamBien/spg into the statically generated page archive.adam-bien.com. The whole archive generation process (extracting more than 1400 articles from the DB and launching Enhydrator and SPG in two separate JVMs) takes 5 seconds in total.
The extraction is not specific to my blog and should work with all rollerweblogger blogs. A slightly modified pipeline could be used for migrations, backups or indexing.