Many researchers have worked on extracting opinion phrases from customer review corpora using various approaches, such as LDA and bag of words. The results depend on the data being handled. Voice of the customer (VOC), or opinion phrase extraction, is about revealing the insights hidden in unstructured user reviews. Often the most important information a customer provides is never captured as a data point for management decisions about the target product or service.
At tataatsu labs, we received an interesting customer review corpus from leading computer manufacturers to analyze and address their needs. The problem statement was that their CRM system was unable to locate the valuable feedback provided by customers. This hampered management's efforts to improve the quality of the product or service.
All they needed to know was: a) which parts customers like or dislike, b) who the loyal customers are, and c) what other features customers wish to have.
In a nutshell, from an analytics standpoint, we needed an NLP engine that differs slightly from the naive approach, as outlined below:
Engine tech stack: Python, NLTK, Apache Storm, Postgres, D3.js, OpenNLP
1) Data processing/analyzing
2) Data cleansing
3) Data augmentation
4) Feature extraction
5) Data points for plotting/reporting
Product – “Mac pro”
Source – “ecommerce Website”
Rating – “4/5”
Customer Review –
“I have had this laptop a good few weeks now, tested it fully and know the positives and negatives. I have been purchasing systems only from mac for 6 years, hopefully my review will help you out.Nice sleek, thin and comfortable designed laptop- the casing is very nice ridged texture that looks and feels brilliant. It does accumulate finger prints but nowhere near the level other systems do. It’s also light weight. Screen is nice and wide, perfect for work, gaming and browsing the web. It has a glossy glare which is really nice, easy to clean and good quality. Bright and details are great. The Keyboard is perfect, easy to clean, stylish, nice quality feel and durable. Remember, black absorbs heat and makes it fade. Unfortunately I left mine in the sun and the keys have faded (that is my fault not the product) I’m still looking to find a replacement, would be great if I could be directed. Trackpad feels great, very responsive and easy to use. Left and right clickers (I made that word up) are very nice and feel like they’re built with quality. However, sometimes the mouse freezes for anywhere up to 40 seconds when opening numerous programs or running scans. This does get a little bit frustrating especially knowing it’s an i3 processor. The palm rest has a sharp-ish edge on it, but I’m sure that will wear down with use. I would say it lacks a bit of performance as it should load easy tasks in seconds, it does not always respond when I ask it to and some applications fail to load as quick as they should. Most of the time it works perfectly, it laggs very rarely. Intel HD Graphics is an integrated graphics card that is basic but still very good. I play a standard browser based MMO that runs at 40-45 out of 50 Frames Per Second, for me that is very good. It also handles some games well such as Skyrim, Dirt and Farcry at medium graphics.Audio is great, very loud and clear.Webcam is very nice, perfect for streaming, very high detailed with no ‘jittering. 
Ports are good, the HDMI works perfectly and looks great with my TV. AC adapter is Standard yet very durable and lightweight. Windows8 isEasy to use, quickly able to adapt from taking a first time leap from XP all the way to the newest OS. At first I was a bit pessimistic, but I’m very happy with it’s design, features and ease of use. Service and delivery was great from mac as usual. A very satisfied customer and I won’t fail to recommend this product. Cons:Disk tray is flimsy and not very well built when open, it still performs well. The casing on the front of the screen (trim where the webcam is) feels ‘plasticy, weak and cheap, the back is ridged and a nice design though. Overall rating from me, 8/10. Some mildly rough patches but still a great system, the price is very decent for what you’re getting.”
Data processing and analysis is the slice-and-dice pass over the given data. It often requires reading the data multiple times to arrive at the features hidden in it. This step helps reveal the textual patterns that the NLP process will rely on.
Data cleansing should remove unnecessary data as well as outliers; otherwise they can reduce the overall accuracy and outcome of the process. There are multiple ways to identify outliers, depending on the data domain involved. At times, outliers lead to different insights from the data, so we fed the outlier data into our generic processing engine to reveal them.
Data augmentation is usually not preferred by many NLP engines, but we strongly believe this step is needed because it improves the precision and recall of feature extraction. In this case, we asked the vendor to provide the official list of part and service names from their database. We stored it as lookup values to improve accuracy when preparing the feature extraction process.
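To make the lookup idea concrete, here is a minimal sketch of matching review sentences against a parts list. The entries in PARTS_LOOKUP and the function name find_known_parts are made up for illustration; the real list came from the vendor's database and was stored in our own store, which is not shown here.

```python
import re

# Hypothetical lookup values; the real list came from the vendor's database.
PARTS_LOOKUP = {"keyboard", "trackpad", "screen", "webcam", "disk tray", "ac adapter"}


def find_known_parts(sentence, lookup=PARTS_LOOKUP):
    """Return the lookup entries that occur as whole words or phrases."""
    text = sentence.lower()
    found = []
    for name in sorted(lookup):
        # Word boundaries keep "screen" from matching inside "screenshot".
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append(name)
    return found


print(find_known_parts("The Keyboard is perfect, but the disk tray is flimsy."))
# ['disk tray', 'keyboard']
```

Matching on whole words is a deliberate choice here: a plain substring test would be simpler but would produce false hits on longer words that happen to contain a part name.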
Feature extraction for opinion phrases is very interesting and always challenging with unstructured data. The usual weapon here is the POS tagger, followed by a grammar chunk pattern filter.
e.g. phrases such as “keyboard is good” and “monitor is blur”
So the process was to prepare a list of chunker patterns such as Noun-Verb, Adj-Verb, etc. to identify the opinion phrase in the underlying sentence. We had to run many iterations to capture all the types of phrases. Each selected feature was then verified against the proprietary parts/service lookup database for accuracy, and the sentence was flagged for further processing in the pipeline.
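The chunk pattern filter can be sketched roughly as follows. This is not our production chunker: the tokens are hand-tagged with Penn Treebank style tags to keep the example self-contained (in practice a tagger such as NLTK's pos_tag would supply them), and the two patterns in PATTERNS are illustrative stand-ins for the much longer list we built up over many iterations.

```python
# Hand-tagged tokens stand in for the output of a real POS tagger.
TAGGED = [("keyboard", "NN"), ("is", "VBZ"), ("good", "JJ"), ("and", "CC"),
          ("the", "DT"), ("trackpad", "NN"), ("feels", "VBZ"), ("great", "JJ")]

# Illustrative patterns only: each is a sequence of tag prefixes,
# so "NN" also matches NNS and "VB" also matches VBZ, VBD, and so on.
PATTERNS = [("NN", "VB", "JJ"), ("JJ", "NN")]


def extract_phrases(tagged, patterns=PATTERNS):
    """Slide over the tagged tokens and collect spans matching any pattern."""
    phrases = []
    i = 0
    while i < len(tagged):
        for pattern in patterns:
            window = tagged[i:i + len(pattern)]
            if len(window) == len(pattern) and all(
                    tag.startswith(prefix)
                    for (_, tag), prefix in zip(window, pattern)):
                phrases.append(" ".join(word for word, _ in window))
                i += len(pattern) - 1  # skip the tokens just consumed
                break
        i += 1
    return phrases


print(extract_phrases(TAGGED))
# ['keyboard is good', 'trackpad feels great']
```

Each extracted phrase would then be checked against the parts/service lookup before the sentence is flagged for the next stage.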
Similarly, we had to prepare candidate sentences for wish-list items, loyalty items, and outlier items.
Once we had identified the sentences, we ran sentiment analysis on each one and took a score from it.
At this stage, we finalized the data points for the next-level analysis report.
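As a rough illustration of per-sentence scoring, the sketch below uses a tiny hand-made polarity lexicon with a one-word negation check. This is an assumption made for the example, not the scorer we used: the words and weights in LEXICON, the NEGATORS set, and the function name sentence_score are all invented here.

```python
import re

# Toy polarity lexicon, invented for illustration only.
LEXICON = {"great": 1.0, "perfect": 1.0, "nice": 0.5, "good": 0.5,
           "flimsy": -1.0, "frustrating": -1.0, "cheap": -0.5, "weak": -0.5}
NEGATORS = {"not", "no", "never"}


def sentence_score(sentence):
    """Sum lexicon weights, flipping the sign after an immediate negator."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        if token in LEXICON:
            value = LEXICON[token]
            if i > 0 and tokens[i - 1] in NEGATORS:
                value = -value
            score += value
    return score


print(sentence_score("Trackpad feels great, very responsive"))  # 1.0
print(sentence_score("Disk tray is flimsy and feels cheap"))    # -1.5
print(sentence_score("The screen is not good"))                 # -0.5
```

A score per flagged sentence, paired with the part it mentions, is the kind of data point that can then be aggregated for reporting.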