So recently, I had a requirement to store unstructured JSON data that was coming back from a web service. The web service was returning various soccer teams from around the world. Most of the teams included a list of the soccer players who were part of the team. Some of the teams had 12 players, some had 20, and some had even more than 20. The players had their own attributes; some were easy to predict, some impossible. For the entire data structure, the only attribute that I knew would definitely be coming back was the team's teamname. After that, it depended on each team.
    {
      "teams": [
        {
          "teamname": "Kung fu pirates",
          "founded": 1962,
          "players": [
            {"name": "Robbie Fowler", "age": 56},
            {"name": "Larry David", "age": 55}
            ...
          ]
        },
        {
          "teamname": "Climate change observers",
          "founded": 1942,
          "players": [
            {"name": "Jim Carrey", "age": 26},
            {"name": "Carl Craig", "age": 35}
            ...
          ]
        },
        ...
      ]
    }

There are several different ways to store this data. I decided to go for MongoDB. Main reasons:
- I wanted to store the data in a format as close as possible to the JSON responses I was getting back from the web service. This would mean less code, fewer bugs, less hassle.
- I wanted something that had a low learning curve, good documentation and good industry support (Stack Overflow threads, blog posts, etc.)
- Something that had a Grails plugin that was documented, had footfall and looked like it was maintained
- Features such as text stemming were nice-to-haves. Some support would have been nice, but it didn't need to be cutting edge.
- Would support good JSON search facilities, indexing, etc
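To make the "only teamname is guaranteed" point concrete, here is a minimal sketch in plain Groovy (no Grails or MongoDB needed) that parses a response shaped like the one above and lists whatever extra attributes each team happens to carry. The payload is a cut-down copy of the sample; the "stadium" attribute is made up purely to show one team carrying a field the other lacks.

```groovy
import groovy.json.JsonSlurper

// Cut-down copy of the sample response. "stadium" is a hypothetical extra
// attribute, added only to illustrate per-team variation.
def json = '''
{ "teams": [
    { "teamname": "Kung fu pirates", "founded": 1962,
      "players": [ {"name": "Robbie Fowler", "age": 56},
                   {"name": "Larry David", "age": 55} ] },
    { "teamname": "Climate change observers", "founded": 1942,
      "stadium": "Hypothetical Park",
      "players": [ {"name": "Jim Carrey", "age": 26},
                   {"name": "Carl Craig", "age": 35} ] }
] }
'''

def response = new JsonSlurper().parseText(json)

// teamname is the only key we rely on; everything else is discovered at runtime.
response.teams.each { team ->
    def extras = team.keySet().findAll { it != 'teamname' }
    println "${team.teamname}: extra attributes ${extras}"
}
```

Nothing here is typed beyond "a map with a teamname", which is roughly the shape I wanted to keep all the way into the database.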
    class Team implements Serializable {
        static mapWith = "mongo"
        static constraints = {
        }
        static mapping = {
            teamname index: true
        }
        String teamname
        List players
        static embedded = ['players']
    }

Regarding the Team domain object:
- The first point to make about the Team domain object is that I didn't even need to create it. The reason I did use this approach was so that I could use GORM-style APIs such as Team.find() if I wanted to.
- Players are just a List of objects. I didn't bother creating a Player object. I like the idea of ensuring the players for the team are always in a List data structure, but I didn't see the need to type anything further.
- The players are marked as embedded. This means the team and players are stored in a single denormalised data structure. This allows - amongst other things - the ability to retrieve and manipulate the team data in a single database operation.
- I marked the teamname as an index.
- I marked the domain object as
static mapWith = "mongo"
This means that if I was also using another persistence solution with GORM (PostgreSQL, MySQL, etc.), I am telling GORM that this Team domain class is only for Mongo - keep your relational hands off it. See here for info. Note: This is a good reminder that GORM is a higher level of abstraction than Hibernate. It is possible to have a GORM object that goes to a NoSQL store and doesn't go near Hibernate.
    private List importTeams(int page) {
        def rs = restClient.get("teams") // invoke web service
        List teams = rs.responseData.teams.collect { teamResponse ->
            Team team = new Team(teamname: teamResponse.teamname)
            team.save() // Save is needed to dynamically add the attributes
            teamResponse.each { key, value ->
                team["$key"] = value
            }
            team.save() // We need the second save to ensure the dynamic attributes get saved.
            return team
        }
        log.info("importTeams(), teams=${teams}")
        teams
    }

Ok, so the main points in our importTeams() method:
- After getting our JSON response we run a collect function on the teams array. This will create the Team domain objects.
- We use some meta programming to dynamically add any attribute that comes back in the JSON team structure to the Team object. Note: we have to invoke save() first to be able to dynamically add attributes that are not declared in the Team domain class to the Team object. We also have to invoke save() again afterwards to ensure that those undeclared attributes are actually saved. This may change in future versions of the MongoDB plugin, but it is what I had to do to get it working (I was using MongoDB plugin version 3.0.1).
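If the team["$key"] = value trick looks like magic, here is a rough stand-in in plain Groovy. To be clear, this is not how the MongoDB plugin implements it; TeamSketch and its dynamicAttributes map are names I made up for illustration. It only shows the general idea: declared properties are set as normal, and anything undeclared is caught and kept alongside them.

```groovy
// Hypothetical stand-in for a schemaless domain class. Plain Groovy only;
// it does not persist anything and is not the plugin's implementation.
class TeamSketch {
    String teamname                                // the one declared attribute
    Map<String, Object> dynamicAttributes = [:]    // everything else ends up here

    // Called by Groovy when an undeclared property is assigned
    def propertyMissing(String name, value) { dynamicAttributes[name] = value }

    // Called by Groovy when an undeclared property is read
    def propertyMissing(String name) { dynamicAttributes[name] }
}

// A map standing in for one team from the web service response
def teamResponse = [
    teamname: 'Kung fu pirates',
    founded : 1962,
    players : [[name: 'Robbie Fowler', age: 56]]
]

def team = new TeamSketch(teamname: teamResponse.teamname)

// Same loop as in importTeams(): subscript assignment with a dynamic key
teamResponse.each { key, value -> team["$key"] = value }

println "declared: ${team.teamname}, dynamic: ${team.dynamicAttributes.keySet()}"
```

In the real domain class the two save() calls are still needed around this loop, as described above; the sketch only covers the attribute-copying part.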
- A leaky abstraction is inevitable here. Sooner or later you are going to have to write a query that GORM won't do very well. Better to approach this head on.
- I wanted to be able to run the queries in the Mongo console first, check explain plans if I needed to, and then use the same query in my code. This is easier to do if I write the query directly without having to worry about what GORM is going to do.
    teams = Team.collection.find(queryMap) // where queryMap is a map of fields and the various values you are searching for.

Ok, some examples of queries...
    Team.collection.find(["teamname": "hicks"]) // Find a team named hicks
    Team.collection.find(["teamname": "hicks", "players.name": "Robbie Fowler"]) // As above but also has Robbie Fowler
    Team.collection.find(["players.name": "Robbie Fowler"]) // Any team that has a Robbie Fowler
    Team.collection.find(["teamname": "hicks", "players.name": "Robbie Fowler"], ['players.$': 1]) // Returns matching player only
    Team.collection.find(["teamname": ~/ick/]) // Match on the regular expression /ick/, i.e. any team whose name contains the text ick

Anything else? Yeah sure. I wanted to connect to a Mongo instance on my own machine when in development, but to a Mongo instance on a dedicated server in other environments (CI, stage, production). To do this, I updated my DataSource.groovy as:
    environments {
        development {
            grails {
                mongo {
                    host = "localhost"
                    port = 27017
                    username = "test"
                    password = "test"
                    databaseName = "mydb"
                }
            }
            dataSource {
                dbCreate = "create-drop" // one of 'create', 'create-drop', 'update', 'validate', ''
                url = "jdbc:h2:mem:devDb;MVCC=TRUE;LOCK_TIMEOUT=10000"
            }
        }
        ci {
            println("In bamboo environment")
            grails {
                mongo {
                    host = "10.157.192.99"
                    port = 27017
                    username = "shop"
                    password = "shop"
                    databaseName = "tony"
                }
            }
            dataSource {
                dbCreate = "create-drop" // one of 'create', 'create-drop', 'update', 'validate', ''
                url = "jdbc:h2:mem:devDb;MVCC=TRUE;LOCK_TIMEOUT=10000"
            }
        }
    }

You'll see I have configured multiple datasources (MongoDB and a relational database, H2 in this listing). I am not advocating using both MongoDB and a relational database, just pointing out it is possible. The other point is that the MongoDB configuration is always under:

    grails {
        mongo {
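As far as I know, Grails 2.x reads DataSource.groovy with Groovy's ConfigSlurper, so you can check how an environments block resolves without starting the app. The sketch below is plain Groovy and uses a trimmed copy of the configuration above (credentials and the dataSource blocks left out); it is only meant to show that each environment name picks up its own grails.mongo settings.

```groovy
// Trimmed copy of the environments block from DataSource.groovy.
// Credentials and the relational dataSource are omitted to keep it short.
def script = '''
environments {
    development {
        grails {
            mongo {
                host = "localhost"
                port = 27017
                databaseName = "mydb"
            }
        }
    }
    ci {
        grails {
            mongo {
                host = "10.157.192.99"
                port = 27017
                databaseName = "tony"
            }
        }
    }
}
'''

// ConfigSlurper takes the environment name and only applies the matching block
def dev = new ConfigSlurper('development').parse(script)
def ci  = new ConfigSlurper('ci').parse(script)

println "development -> ${dev.grails.mongo.host}/${dev.grails.mongo.databaseName}"
println "ci          -> ${ci.grails.mongo.host}/${ci.grails.mongo.databaseName}"
```

This is handy as a quick sanity check when adding a custom environment such as ci, since a typo in the nesting simply produces an empty value rather than an error.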
Ok, this is a simple introductory post. I will try to post something more sophisticated soon.
Until the next time, take care of yourselves.
Any particular reason why you went for Grails? Does it offer particularly good MongoDB integration? And have you tried using Java or PHP drivers with it?
The context for this is that I've been getting interested in MongoDB lately, mainly because the ability to store - and search on - unstructured data seems to offer a way out of the constraints imposed by relational database designs. (A perfectly understandable desire to avoid DB changes where possible can introduce odd distortions into a design over time.)
The project we were working on was already Grails. The dynamic typing of Groovy and the structureless format of MongoDB can play together - but where the lack of structure offers a lot of freedom and flexibility it also means you have to be careful that things don't go completely out of control with domain models etc. Haven't tried using Java or PHP with MongoDB.