Wednesday, June 17, 2015

NoSQL: structured vs. unstructured, what's better?

I recently had a conversation with a friend of mine, Dan Torres, who is one of the most brilliant SQL data gurus I know and who taught me most of the meaningful SQL knowledge I have. We were talking about how I was adopting NoSQL, and he was concerned. He has built some extremely large and meaningful data sets with SQL, and he said he had trouble with the lack of data structure in most applications of NoSQL.

I do agree that structured data lets you get some very meaningful insight out of your data, and that NoSQL allows for a looser data structure. But I think that with NoSQL you can build a structure that evolves as your data evolves. More than once I have found that data modeling decisions I made on day one of a project needed to change, and depending on the change, it can be quite painful to make in a SQL database.
In addition, inbound data is often unstructured, and it can be quite challenging to convert it into structured data. My team has dedicated data entry people who read RSS feeds and extract and enter deal information field by field into our database. This is not an automated process, though we have put some automated helpers in place.
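
To make the idea of an evolving structure concrete, here is a minimal sketch in Python. It is only an illustration: the field names (`company`, `amount_usd`, `investors`) and the `normalize` helper are hypothetical and not taken from our actual system, and plain dictionaries stand in for documents in a NoSQL store.

```python
# Hypothetical deal records stored as schema-less documents (plain dicts here).
deals = [
    # Day-one shape: only a company and an amount.
    {"company": "Acme", "amount_usd": 5_000_000},
    # Later shape: a new field appears without migrating the older record.
    {"company": "Globex", "amount_usd": 12_000_000, "investors": ["Initech"]},
]


def normalize(deal):
    """Return a deal with defaults filled in for fields older records lack."""
    return {
        "company": deal["company"],
        "amount_usd": deal.get("amount_usd"),
        "investors": deal.get("investors", []),
    }


normalized = [normalize(d) for d in deals]
```

The point of the sketch is that the new `investors` field can show up in later records while older ones stay untouched; the cost is that the reading code has to tolerate missing fields, which is roughly the lack of enforced structure Dan was worried about.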

So part of my exploration of NoSQL is to see how it can help us tease structure and meaning out of unstructured data. We have built semantic technologies to help our research group classify deals and companies from unstructured text, with some success, and I am hoping NoSQL can help us be more successful.

Looking at Google and Yahoo!, we see they are addressing the same problem at a much greater scale. They take all the pages on the web, which are extremely unstructured, and glean structure and meaning from them so that we can find those pages in their search applications. And I know that they are not using SQL to structure their data.
