In SQL, grouping a count for a given time granularity within a date range would look something like this:
SELECT DATE(stuffDate), COUNT(*)
FROM collection
WHERE stuffDate > :from AND stuffDate < :to
GROUP BY DATE(stuffDate);
Doing the same in MongoDB with MapReduce requires you to know about scope variables. The problem is, I found that scope variables are not very well documented. You can find more about them in the MapReduce overview in the MongoDB documentation. The relevant parts are:
[, scope : <object where fields go into javascript global scope >]
and
scope - can pass in variables that can be access from map/reduce/finalize.
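To make "fields go into javascript global scope" a little more concrete, here is a minimal sketch in plain JavaScript. The helper `callWithScope` is hypothetical and only imitates the effect; it is not how MongoDB implements scope. It shows a function using `from` and `to` without declaring them, exactly as the map function does further down.

```javascript
// Hypothetical helper: makes every field of `scope` visible by name inside `fn`.
// This only imitates the effect of the scope option; it is not MongoDB's implementation.
function callWithScope(fn, scope, thisArg, args) {
  const names = Object.keys(scope);
  const values = names.map(function (name) { return scope[name]; });
  // Rebuild fn from its source inside a wrapper whose parameters are the scope names,
  // so free variables such as `from` and `to` resolve to the scope values.
  const factory = new Function(names.join(","), "return (" + fn.toString() + ");");
  return factory.apply(null, values).apply(thisArg, args || []);
}

// `from` and `to` are never declared here; they are supplied through the scope.
const inRange = function (date) { return date > from && date < to; };

const scope = {
  from: new Date("2012-01-06T14:54:20.000Z"),
  to: new Date("2012-02-02T11:01:28.000Z")
};

const inside = callWithScope(inRange, scope, null, [new Date("2012-01-10T00:00:00.000Z")]);
const outside = callWithScope(inRange, scope, null, [new Date("2012-03-01T00:00:00.000Z")]);
console.log(inside, outside); // prints: true false
```

Calling `inRange` directly, without the scope, would throw a ReferenceError because `from` and `to` are undefined; that is the situation the scope option solves on the server.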
Back to this example. First, let's define some data:
{
  "_id": "4f0c56f1b8eea0b686189c90",
  "meh": "meh",
  "feh": "feh",
  "arrayOfStuff": [
    {
      "name": "Elgin City",
      "date": "2012-01-06T14:54:21.000Z"
    },
    {
      "name": "Rangers",
      "date": "2012-02-02T11:01:27.000Z"
    },
    {
      "name": "Arsenal",
      "date": "2012-02-03T10:56:23.000Z"
    }
  ]
}
{
  "_id": "4f0c56f1b8eea0b686189c99",
  "meh": "meh meh meh meh",
  "feh": "feh feh feh feh feh feh",
  "arrayOfStuff": [
    {
      "name": "Satriani",
      "date": "2011-11-01T11:51:46.000Z"
    },
    {
      "name": "Vai",
      "date": "2012-01-01T15:16:21.000Z"
    },
    {
      "name": "Johnson",
      "date": "2012-03-01T12:11:27.000Z"
    }
  ]
}
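// Map: emit one count per array element whose date falls inside (from, to).
// from and to are not declared here; they arrive through the scope option below.
m = function() {
  if(this.arrayOfStuff) {
    this.arrayOfStuff.forEach(function(stuff) {
      if(stuff.date > from && stuff.date < to) {
        // Truncate the timestamp to the hour; the key is called "day" but the granularity here is hourly.
        // Note: getFullYear() and friends use the server's local time zone; the getUTC* variants avoid that.
        var date = Date.UTC(stuff.date.getFullYear(), stuff.date.getMonth(), stuff.date.getDate(), stuff.date.getHours());
        emit({day: date}, {count:1});
      }
    });
  }
};
// Reduce: sum the counts emitted for the same key.
r = function(key , values) {
  var total = 0;
  values.forEach(function(v) {
    total += v.count;
  });
  return {count : total};
};
// Boundaries of the date range (both exclusive).
from = ISODate("2012-01-06T14:54:20.000Z");
to = ISODate("2012-02-02T11:01:28.000Z");
res = db.collection.mapReduce(m, r, { query : { "meh" : "meh"}, out : "hackola", scope : {"from": from, "to": to}});
db.hackola.find();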
We're selecting the documents in this collection where the value of "meh" is "meh". We've then defined two dates, from and to, to represent the boundaries of the date range, and we pass them to the mapReduce call through the scope option. This means that whatever is defined in scope can be used in the Map function (and also in the Reduce and Finalize functions).
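To see what the job should produce for the sample data, here is a rough in-memory sketch in plain JavaScript that needs no server. `miniMapReduce` is a hypothetical stand-in, not the MongoDB API: it passes `emit` and `scope` to the map function as arguments instead of exposing them as globals, and the query is a plain predicate. The sketch also assumes the dates are real Date objects and uses the getUTC* getters so the result does not depend on the local time zone.

```javascript
// In-memory copy of the sample data, with dates as Date objects.
const docs = [
  { meh: "meh", arrayOfStuff: [
    { name: "Elgin City", date: new Date("2012-01-06T14:54:21.000Z") },
    { name: "Rangers", date: new Date("2012-02-02T11:01:27.000Z") },
    { name: "Arsenal", date: new Date("2012-02-03T10:56:23.000Z") }
  ]},
  { meh: "meh meh meh meh", arrayOfStuff: [
    { name: "Satriani", date: new Date("2011-11-01T11:51:46.000Z") },
    { name: "Vai", date: new Date("2012-01-01T15:16:21.000Z") },
    { name: "Johnson", date: new Date("2012-03-01T12:11:27.000Z") }
  ]}
];

// Hypothetical miniature of mapReduce: filter, map with emit, group by key, reduce.
function miniMapReduce(docs, map, reduce, options) {
  const groups = new Map();
  const emit = function (key, value) {
    const k = JSON.stringify(key); // group by the serialized key
    if (!groups.has(k)) groups.set(k, { key: key, values: [] });
    groups.get(k).values.push(value);
  };
  docs.filter(options.query).forEach(function (doc) {
    map.call(doc, emit, options.scope); // `this` is the document, as in MongoDB
  });
  return Array.from(groups.values()).map(function (g) {
    // Like MongoDB, skip reduce when a key has a single value.
    const value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
    return { _id: g.key, value: value };
  });
}

const map = function (emit, scope) {
  (this.arrayOfStuff || []).forEach(function (stuff) {
    if (stuff.date > scope.from && stuff.date < scope.to) {
      // Truncate to the hour, in UTC.
      const bucket = Date.UTC(stuff.date.getUTCFullYear(), stuff.date.getUTCMonth(),
                              stuff.date.getUTCDate(), stuff.date.getUTCHours());
      emit({ day: bucket }, { count: 1 });
    }
  });
};

const reduce = function (key, values) {
  return { count: values.reduce(function (sum, v) { return sum + v.count; }, 0) };
};

const results = miniMapReduce(docs, map, reduce, {
  query: function (doc) { return doc.meh === "meh"; },
  scope: { from: new Date("2012-01-06T14:54:20.000Z"), to: new Date("2012-02-02T11:01:28.000Z") }
});
console.log(JSON.stringify(results));
```

Only the first document matches the query, and only "Elgin City" and "Rangers" fall strictly between from and to, so the sketch yields two hourly buckets with a count of 1 each.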
Once we have this working from the shell, it is straightforward to implement in application code. This is the very same thing implemented in Java.
DBCollection coll = db.getCollection("collection");
// Same map function as in the shell; from and to are resolved from the scope set below.
String map = "function() {" +
    "if(this.arrayOfStuff) {" +
    "this.arrayOfStuff.forEach(function(stuff) {" +
    "if(stuff.date > from && stuff.date < to) {" +
    "var date = Date.UTC(stuff.date.getFullYear(), stuff.date.getMonth(), stuff.date.getDate(), stuff.date.getHours());" +
    "emit({day: date}, {count:1});" +
    "}" +
    "});" +
    "}" +
    "};";
// Same reduce function as in the shell: sum the counts per key.
String reduce = "function(key , values) {" +
    "var total = 0;" +
    "values.forEach(function(v) {" +
    "total += v.count;" +
    "});" +
    "return {count : total};" +
    "};";
// meh holds the value to match (the shell example uses "meh"); it is not declared in this snippet.
DBObject query = new BasicDBObject();
query.put("meh", meh);
MapReduceCommand cmd = new MapReduceCommand(coll, map, reduce, null, MapReduceCommand.OutputType.INLINE, query);
// from and to are the range boundaries (for example java.util.Date values); they are not created in this snippet.
Map<String, Object> scope = new HashMap<String, Object>();
scope.put("from", from);
scope.put("to", to);
cmd.setScope(scope);
MapReduceOutput out = coll.mapReduce(cmd);
Comment from http://www.reddit.com/user/Madd0g
Is it just me or is there no example of how the to/from variables are created in Java?
Anyway, great example. On the very few occasions when I needed to pass params, I just injected them into the JS function... which always felt wrong:
jsCompareCode = @"if (" + jsWordArray + ".indexOf(this.Words[item]) != -1) continue;";
I'm a bit confused by your example. Specifically, this part...
"this.arrayOfStuff.forEach(function(hit) {...}"
You reference "hit" in your function but then inside the block you refer to "stuff"...am I missing something?
Thanks for reading and taking the time to comment.
You're 100% correct - I would have been thoroughly confused myself.
The 'hit' references in the shell and the Java implementations were both typos; this typo (heh) thing happens when you take actual code, reduce it and then obfuscate it for presentational purposes. Both errors have been fixed.
Does it make sense now? Is there anything that could be explained in more detail?