Wednesday, January 25, 2012

Coining Phrases - Reverse Auto-Classification

Recently, I have been talking with some software folks who have created parsers for auto-classification. These engines take a text or website, scan the contents, and see how well it matches a given taxonomy. They use a variety of techniques, including statistics, semantics, and word location. They are very impressive, and I hope to use one of them to classify our content against our taxonomy.

One of the unique traits of our taxonomy is that it is multidimensional, and we use those dimensions to show the different ways that businesses operate in our economy. We have been able to leverage this to find buyers, sellers and comparables for Mergers and Acquisitions. One of the problems with this approach is that users who are searching our database need a fairly in-depth knowledge of the taxonomy in order to find what they need. To solve this problem, we decided to create meta-terms, or coined phrases, that each represent a set of search criteria to apply to our taxonomy. Instead of going from text to taxonomy, we are going from taxonomy to text. We also use all the synonyms for all the nodes in our taxonomy. So instead of selecting their criteria from a series of drop-down boxes, users can type in a Google-like text box, and the software will auto-fill matches against the table of coined phrases generated from our taxonomy.
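To make the taxonomy-to-text idea concrete, here is a minimal Python sketch. It is not our production code: the two tiny dimensions, the synonym lists, and the function names coin_phrases and autofill are all made up for illustration. It crosses the synonyms of two taxonomy dimensions to coin phrases, then matches what the user types against that phrase table by prefix.

```python
from itertools import product

# Hypothetical two-dimension taxonomy: each node maps to its synonyms.
CLIENTELE = {"accountants": ["accountant", "accounting", "cpa"]}
SOLUTION = {"software": ["software", "application"]}


def coin_phrases(clientele, solution):
    """Build the phrase table: coined phrase -> (clientele node, solution node)."""
    table = {}
    for (c_node, c_syns), (s_node, s_syns) in product(clientele.items(), solution.items()):
        for c_syn, s_syn in product(c_syns, s_syns):
            table[f"{c_syn} {s_syn}"] = (c_node, s_node)
    return table


def autofill(table, typed, limit=10):
    """Return coined phrases that start with the typed text (case-insensitive)."""
    prefix = typed.strip().lower()
    return sorted(p for p in table if p.startswith(prefix))[:limit]


table = coin_phrases(CLIENTELE, SOLUTION)
suggestions = autofill(table, "acc")
criteria = table["accountant software"]  # the search criteria behind the phrase
```

Each suggestion carries the taxonomy nodes it was coined from, so picking a phrase is equivalent to picking the drop-down values by hand.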

I have a live sample on our MandAsoft site here: http://mandasoft.com/segments/searchlob.aspx. A good example is "accountant software", a coined phrase that will search for software for accountants. Here are the results for that search: http://mandasoft.com/segmentview.aspx?SearchID=LOB99.
Tell me what you think.

Tuesday, January 17, 2012

Taxonomy Evolution Conundrum

Our team has been developing our taxonomy for almost ten years now. Our goal is to classify businesses by looking at how they operate, who they serve, and what they do, and our focus has been on media and software businesses. Needless to say, over the last ten years there have been major changes to the media and software industries with the introduction of smart phones, tablet computers, cloud computing, SaaS, virtualization, etc. To handle this evolution of the content we are classifying, we needed to make sure that our framework was solid, that the taxonomy could change by adding, merging, and linking nodes, and that our classifications migrated with those changes.

However, change is never apparent when it happens. When we saw the first businesses operating in Social Networking, we originally classified them basically as forums of user-generated content, as opposed to editorial content. But as the business and technology took off and showed itself to be a new business model, we realized we had to add the term Social Networking to our taxonomy. Now our problem was that we had to go back and re-evaluate the companies that were classified as forums and see if they were really Social Networks. One way to fix this problem is to have an auto-classifier: you set up a new set of rules to recognize Social Networking, and then you re-run the auto-classifier on those companies. But here is the conundrum: we noticed this evolution in business models because we had human eyes seeing the trend. How can you expect an auto-classifier to see that? What are your thoughts on this problem?
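For what it is worth, the re-run step itself is easy to sketch. The Python below is only an illustration under assumptions: the company records, the keyword list, and the names reclassify and social_rule are invented, and a real auto-classifier would use far richer rules than keyword matching.

```python
def reclassify(companies, old_node, new_node, rule):
    """Move companies from old_node to new_node when rule(description) is true."""
    moved = []
    for company in companies:
        if company["node"] == old_node and rule(company["description"]):
            company["node"] = new_node
            moved.append(company["name"])
    return moved


def social_rule(description):
    """Hypothetical rule: keywords a human chose after spotting the trend."""
    text = description.lower()
    return any(keyword in text for keyword in ("friend", "follower", "profile"))


# Invented sample records, all originally classified as forums.
companies = [
    {"name": "ChatBoard", "node": "Forums",
     "description": "Threaded discussion boards for hobbyists"},
    {"name": "FriendLoop", "node": "Forums",
     "description": "Build a profile and connect with friends"},
]

moved = reclassify(companies, "Forums", "Social Networking", social_rule)
```

Note that the sketch does not escape the conundrum: someone still had to notice the new business model before social_rule could be written.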

Monday, January 9, 2012

A Multi-Dimensional Quandary - Social Networking

Today our team had an interesting issue come up. As our taxonomy tries to model business, we try to keep on top of how business models evolve, and we find we have to reconsider terms and what they mean. The term that gave us pause today was Social Networking. We use a multi-dimensional taxonomy with four distinct trees that model different aspects of a business. Each one roughly answers one of the following questions: 1) who is the clientele of a business, 2) how does a company do business, 3) what problems does a business solve or what subject area does it specialize in, and 4) what channel does the business use to reach customers. Our problem was that we had Social Networking defined in our business solution tree, but then we found that Social Networking started to morph into something beyond Facebook, Twitter and LinkedIn. Shopping sites started to incorporate Social Networking into their businesses, and then Social Networking no longer seemed like an "end" but a "means to an end". So our solution, which is far from the only solution, was to add Social Networking to our channel tree, where it sits alongside the mobile and online terms. I hate having the same term in multiple trees, but the term is now used in multiple contexts, and we do have repeated terms for different contexts. What are your thoughts?
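Here is a small Python sketch of how a repeated term can stay unambiguous when every classification records which tree it came from. Everything in it is an assumption for illustration: the tree keys, the node names other than Social Networking, Mobile and Online, and the two companies are invented.

```python
# Hypothetical miniature of the four trees.
TREES = {
    "clientele": {"Consumers", "Retailers"},
    "business_model": {"Advertising", "E-Commerce"},
    "solution": {"Social Networking", "Shopping"},
    "channel": {"Social Networking", "Mobile", "Online"},
}


def trees_containing(term):
    """List every tree in which a term appears."""
    return sorted(tree for tree, terms in TREES.items() if term in terms)


def classify(**choices):
    """Validate one company's classification: tree name -> set of terms."""
    for tree, terms in choices.items():
        unknown = set(terms) - TREES[tree]
        if unknown:
            raise ValueError(f"{sorted(unknown)} not in the {tree} tree")
    return {tree: set(terms) for tree, terms in choices.items()}


def search(companies, tree, term):
    """Find companies tagged with a term in one specific tree."""
    return sorted(name for name, c in companies.items() if term in c.get(tree, set()))


COMPANIES = {
    "PureSocial": classify(solution={"Social Networking"}, channel={"Online"}),
    "ShopTogether": classify(solution={"Shopping"},
                             channel={"Social Networking", "Online"}),
}

as_end = search(COMPANIES, "solution", "Social Networking")
as_means = search(COMPANIES, "channel", "Social Networking")
```

Because the tree is part of every query, Social Networking as an "end" and as a "means to an end" return different companies even though the term is spelled the same.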

Thursday, January 5, 2012

Taxonomy Mapping Engine in Use

Our team is in Annual Trends Report mode. Yesterday we published the first of our seven Trends Reports tracking Mergers & Acquisitions in the Media and Software Sectors. This report is for the media space. One part of preparing the report is collecting the deals for this space. We use an auto-population algorithm that refreshes the deal lists many times a day. We then use an Industry Map and rules set that maps the categorized deals into a simple flat taxonomy used only for this report. We can then compare sub-segments of the Sector to see which are performing better or worse. Note that we do this for seven different reports, each with its own Industry Map and rules set. We never have to categorize a deal more than once; the mapping engine puts the deals into the appropriate bins for each report. Check it out here.
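The idea of categorizing once and mapping many times can be sketched in a few lines of Python. This is a simplified illustration, not our actual engine: the deals, the tags, the segment names, and the first-match-wins rule format are all assumptions.

```python
def map_deals(deals, industry_map):
    """Sort deals into a report's flat segments; the first matching rule wins."""
    bins = {segment: [] for segment, _ in industry_map}
    for deal in deals:
        for segment, required in industry_map:
            if required <= deal["tags"]:  # every required tag is on the deal
                bins[segment].append(deal["name"])
                break
    return bins


# Invented deals, each categorized exactly once against the full taxonomy.
DEALS = [
    {"name": "Deal A", "tags": {"media", "online", "b2b"}},
    {"name": "Deal B", "tags": {"media", "print"}},
    {"name": "Deal C", "tags": {"software", "saas"}},
]

# Hypothetical Industry Maps: ordered (segment, required tags) rules per report.
MEDIA_MAP = [
    ("Online Media", {"media", "online"}),
    ("Other Media", {"media"}),
]
SOFTWARE_MAP = [
    ("SaaS", {"software", "saas"}),
    ("Other Software", {"software"}),
]

media_report = map_deals(DEALS, MEDIA_MAP)
software_report = map_deals(DEALS, SOFTWARE_MAP)
```

The same DEALS list feeds both reports; only the map changes, and a deal that matches no rule in a map simply stays out of that report.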

Wednesday, January 4, 2012

Flying a Kite - Starting a Taxonomy

For Christmas, I got this wonderful book about the Brooklyn Bridge. It is called "The Great Bridge" by David McCullough. In one part, he writes about the first bridge to cross the Niagara Gorge, built by Charles Ellet. The way Ellet started the bridge was to offer five dollars to the first American boy who could fly a kite over to the Canadian side of the gorge. The bridge span was 1,010 feet, and young Homan Walsh won the prize. Ellet took the kite string that spanned the gorge, tied successively heavier cords to it, and pulled them across until he had a heavy cable spanning the gorge, and from that he built his bridge.

This story reminded me of our team's first efforts at building a business taxonomy. We started with a simple flat set of categories, and then added a second set of categories. After that we migrated to hierarchical trees, and then to banyans, and today we are ever adding features and complexity to our business taxonomy. But we could not have gotten to where we are today unless we had first tried our first simple solution to span our own problem.