Welcome to JamiaYant.com

Data Mining
By Jamia Yant
June 1st, 2012

 

Predictive Analytics and Customer Behavior
“Predictive analysis is the decision science that removes guesswork out of the decision-making process and applies proven scientific guidelines to find right solution in the shortest time possible.” (Kaith, 2011)  There are seven steps to predictive analytics: (1) spot the business problem; (2) explore the various data sources; (3) extract patterns from the data; (4) build a sample model using the data and the problem; (5) clarify the data by finding valuable factors and generating new variables; (6) construct a predictive model using sampling; and (7) validate and deploy the model.  By following this method, businesses can make fast decisions using vast amounts of data.  There are three main benefits of predictive analytics: minimizing risk, identifying fraud, and pursuing new sources of revenue.  Predicting the risk involved in loan and credit origination, flagging likely fraudulent insurance claims, and forecasting the response to promotional offers and coupons are all examples of these benefits.  In short, predictive analytics reduces the cost of making mistakes.  It allows businesses to test situations and scenarios that could take years to test in the real world.  Studying customer behavior gives businesses a competitive advantage and allows them to stay ahead of the competition in their marketplace.
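The model-building and validation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the loan history, the credit bands, and the function names (split, fit, predict, accuracy) are hypothetical and do not come from any cited source, and the "model" is nothing more than a per-band default rate. The point is the shape of the process: sample the data, fit on one part, and validate on records the model has not seen.

```python
from collections import defaultdict

# Hypothetical loan history: (credit_band, defaulted). Illustrative data only.
history = [
    ("low", True), ("low", True), ("low", False), ("low", True),
    ("mid", False), ("mid", True), ("mid", False), ("mid", False),
    ("high", False), ("high", False), ("high", False), ("high", True),
    ("low", True), ("mid", False), ("high", False), ("low", True),
]

def split(records, every=4):
    """Hold out every `every`-th record for validation; train on the rest."""
    train = [r for i, r in enumerate(records) if (i + 1) % every != 0]
    holdout = [r for i, r in enumerate(records) if (i + 1) % every == 0]
    return train, holdout

def fit(train):
    """Estimate the default rate per credit band from the training sample."""
    counts = defaultdict(lambda: [0, 0])  # band -> [defaults, total]
    for band, defaulted in train:
        counts[band][0] += int(defaulted)
        counts[band][1] += 1
    return {band: d / total for band, (d, total) in counts.items()}

def predict(model, band, threshold=0.5):
    """Flag a loan as risky when the estimated default rate reaches the threshold."""
    return model.get(band, 0.0) >= threshold

def accuracy(model, records):
    """Share of records where the prediction matches the known outcome."""
    hits = sum(predict(model, band) == defaulted for band, defaulted in records)
    return hits / len(records)

train, holdout = split(history)
model = fit(train)
holdout_accuracy = accuracy(model, holdout)
print(model, holdout_accuracy)
```

A real project would use far more data and a proper statistical model, but the hold-out score plays the same role: it estimates how the model will behave once deployed.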

 
Associations Discovery and Customer Purchases
Association analysis is useful for discovering interesting relationships hidden in large amounts of data.  There are two things to remember when applying association analysis to market data: discovering patterns in a large transaction data set can be computationally expensive, and some of the discovered patterns are potentially spurious because they may occur simply by chance.
Association discovery finds rules about items that appear together in an event such as a purchase transaction.  Market-basket analysis is a well-known example of association discovery.  This kind of algorithm is used in recommendation engines, which recommend products to customers based on items they have already bought or shown interest in.  This benefits the business by allowing it to stage its products effectively and to know which customers to target for specific promotions or new products.  (Two Crows Corp, 1999)
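As a rough illustration of market-basket analysis, the sketch below counts how often pairs of items appear together and derives simple "if X then Y" rules. The transactions, the thresholds, and the function name pair_rules are assumptions made for this example only. Support is the share of transactions containing both items, confidence is how often Y appears when X does, and lift compares that confidence with how common Y is overall; a lift close to 1 suggests the pairing may be no more than chance.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be worth reporting
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence, lift))
    # Strongest associations (highest lift) first.
    return sorted(rules, key=lambda r: (-r[4], r[0], r[1]))

for x, y, support, confidence, lift in pair_rules(transactions):
    print(f"{x} -> {y}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Counting every pair is exactly the step that becomes computationally expensive on large transaction sets, which is why production tools prune rare items before building larger combinations.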


Web Mining to Discover Business Intelligence from Web Customers
Web data mining is the process of extracting structured information from unstructured or semi-structured web data sources.  Companies use web data mining to gather data from different websites and collate it for analysis, or to build websites that bring together information from several sources.  This lets visitors find a lot of information in one location instead of reading it on different websites.  For business intelligence, competition in e-commerce markets and the vast number of options customers have today have forced businesses to employ marketing strategies built largely on data obtained through web mining.  Web usage mining is critical for effective Web site management, creating adaptive Web sites, business and support services, personalization, network traffic flow analysis and more.  Business intelligence keeps a business informed of market trends, alerts it to new avenues of generating revenue, and helps determine the status of the competition.
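One small example of turning semi-structured web data into structured records is parsing a web server's access log, a common starting point for web usage mining. The sketch below assumes log lines in the widely used Common Log Format; the sample lines, the LOG_PATTERN expression, and the parse function are hypothetical and written for this illustration.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format (illustrative data only).
log_lines = [
    '10.0.0.1 - - [01/Jun/2012:10:00:00 +0000] "GET /products/widget HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jun/2012:10:00:05 +0000] "GET /cart HTTP/1.1" 200 128',
    '10.0.0.1 - - [01/Jun/2012:10:00:09 +0000] "GET /cart HTTP/1.1" 200 128',
    'malformed line that the parser should skip',
    '10.0.0.3 - - [01/Jun/2012:10:01:00 +0000] "GET /products/widget HTTP/1.1" 404 0',
]

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse(lines):
    """Turn raw log lines into dictionaries, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records

records = parse(log_lines)
# Count successful page views per path.
page_views = Counter(r["path"] for r in records if r["status"] == "200")
print(page_views.most_common())
```

Once the log is in this structured form, the same records can feed traffic analysis, personalization, or the clustering and association techniques discussed elsewhere in this article.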


Clustering and Customer Information
Clustering analysis subdivides a market into distinct subsets of customers where any subset may potentially be selected as a market target to be reached with a distinct marketing mix.  This type of analysis finds clusters of data objects that are similar in some sense to one another and segments that data. (Oracle.com, 2008)
Businesses today collect information about which pages site users visit and the order in which they visit them.  Consider a business that provides online ordering: because customers must log in to the site, the company has click information for each customer profile.  By using a clustering algorithm on this data, the business can find groups, or clusters, of customers who have similar patterns or sequences of clicks.  The business can then use these clusters to analyze how users move through the Web site, to identify which pages are most closely related to the sale of a particular product, and to predict which pages are most likely to be visited next.
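To make the idea of grouping customers by click behavior concrete, here is a minimal k-means sketch. It is a deliberate simplification: each customer is reduced to two page-view counts rather than a full click sequence, and the data, the starting centroids, and the kmeans function are assumptions for illustration, not the method of any product cited here.

```python
import math

# Hypothetical per-customer page-view counts: (product pages, support pages).
visits = {
    "c1": (9, 1), "c2": (8, 2), "c3": (10, 0),
    "c4": (1, 9), "c5": (2, 8), "c6": (0, 10),
}

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = list(visits.values())
# Start from two arbitrary customers as initial centroids (an assumption of this sketch).
centroids, clusters = kmeans(points, [visits["c1"], visits["c4"]])
print(centroids)
print(clusters)
```

With this toy data the two groups are easy to name: customers who mostly browse product pages and customers who mostly read support pages. Each group could then be reached with its own marketing mix.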


Reliability of Data Mining Algorithms
Data mining algorithms leave room for error and misuse.  An algorithm is only reliable if it has gone through sufficient validation testing, and its results must be validated as well.  Not all patterns discovered by data mining algorithms are valid: a pattern can appear in the test data but not in the general population of the data.  There are three measures of a data mining model: accuracy, reliability and usefulness.  Accuracy measures how well the model correlates an outcome with the attributes in the data that has been provided.  Reliability focuses on how the mining model performs on different sets of data.  Usefulness examines various metrics that tell you whether the model provides useful information.  Users of an algorithm can also go wrong by asking the wrong question, failing to test the reasonableness of the results, ignoring discrepancies in the data, overlooking simple explanations and building overly complex models, overgeneralizing from the results, using insufficient or inadequate data, or relying on a single data analysis tool.
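The difference between accuracy and reliability can be shown with a small cross-validation sketch: the same kind of model is trained and scored on several different splits of the data, and the spread of the scores indicates how dependable it is. Everything here is hypothetical and chosen for illustration, including the records, the simple threshold model, and the function names.

```python
from statistics import mean

# Hypothetical labelled records: (monthly_visits, bought). Illustrative data only.
records = [
    (1, False), (2, False), (3, False), (8, True), (9, True), (7, True),
    (2, False), (6, True), (5, False), (9, True), (1, False), (4, True),
]

def fit_threshold(train):
    """Use the midpoint between the two class means as the decision threshold."""
    pos = [x for x, bought in train if bought]
    neg = [x for x, bought in train if not bought]
    return (mean(pos) + mean(neg)) / 2

def accuracy(threshold, test):
    """Share of test records where 'visits >= threshold' matches the real outcome."""
    hits = sum((x >= threshold) == bought for x, bought in test)
    return hits / len(test)

def k_fold_accuracies(records, k=3):
    """Train on k-1 folds and score on the held-out fold, for each fold in turn."""
    folds = [records[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit_threshold(train), test))
    return scores

scores = k_fold_accuracies(records)
print(scores, mean(scores))
```

If the scores differ widely from fold to fold, as they do for this toy data, a single accuracy figure would overstate how well the model is likely to perform on the general population of the data.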


Privacy Concerns when Data Mining Personal Information
In order to perform data mining, information must be gathered and entered into the system.  This information can include private or confidential details that an individual did not release to a third party.  The data can also contain identifying information, so that once the data mining is performed the individuals are no longer anonymous.  This can be a problem with regard to privacy.  Privacy is the right of individuals to control information about themselves.  The valid concerns about data mining revolve around secondary use of personal information, the handling of misinformation, and granulated access to personal information.  The data collected could pose risks to the privacy of persons or organizations.  These risks include, but are not limited to, theft by fraud and actual or incorrect identification that could threaten a person’s life, livelihood, or reputation.  There are documented cases where individuals have obtained a person’s address and then physically harmed that person.  Pedophiles thrive on this type of data, which, as technology advances, leaves individuals vulnerable to all sorts of attacks.  Hackers breaking into large company databases have left many individuals victims of identity fraud.
There are both mandatory and voluntary controls that ease some of these concerns.  There are legal restrictions on the use of information, and legal action can be taken when it is misused, but recovery is a cumbersome and arduous process that can take a long time.  Sadly, the laws lag behind technology and are insufficient on their own to protect individuals.  The voluntary controls consist of technical, methodological and policy approaches that limit opportunities for inappropriate access and help ensure sound data and the desired outcome.  In some cases the consumer has no choice; we all have to give up certain information to buy homes, vehicles and other necessities of life.  But one should be cautious about giving up essential personal information unnecessarily.


Sample Businesses that have Successfully Incorporated Predictive Analysis

  • Blue Cross and Blue Shield System (BCBS) is one organization that is already deriving considerable benefits from predictive analytics. As an organization that provides healthcare insurance to nearly one in three Americans, BCBS has amassed a huge amount of claims-related data over the years.  By applying predictive analytics technologies to its vast trove of claims data, BCBS has been getting better at not only identifying the risk factors that lead to several chronic diseases, but also identifying individuals who are at heightened risk of getting such diseases. (Vijayan, 2011)
  • Memphis Police Department (MPD) has enhanced its crime fighting techniques with IBM predictive analytics software and reduced serious crime by more than 30 percent, including a 15 percent reduction in violent crimes since 2006. MPD is now able to evaluate incident patterns throughout the city and forecast criminal "hot spots" to proactively allocate resources and deploy personnel, resulting in improved force effectiveness and increased public safety. (Armonk, 2010)
  • Target used predictive analytics to determine, based on past purchases, whether a woman could be pregnant. Target assigns every customer a Guest ID number, tied to their credit card, name, or email address, that becomes a bucket storing a history of everything they’ve bought and any demographic information Target has collected from them or bought from other sources. Using that, Target looked at historical buying data for all the women who had signed up for Target baby registries in the past.  They successfully used this information to send out targeted coupons to improve the sales of their maternity and baby products. (Hill, 2011)

References:

  • Two Crows Corporation (1999) Introduction to Data Mining and Knowledge Discovery, http://www.twocrows.com/intro-dm.pdf
  • Angoss (2012) Predictive Analytics in the Cloud Solutions, http://www.angoss.com/predictive-analytics-solutions/cloud-solutions
  • Oracle.com (2008) Oracle Data Mining Concepts, http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/clustering.htm
  • Tiwari, S. (2011) A Web Usage Mining Framework for Business Intelligence, http://www.ijecct.org/v1n1/4.pdf
  • Kaith, R. (2011) Benefits of Predictive Analytics and Data Mining Services, http://www.articlesnatch.com/Article/Benefits-Of-Predictive-Analytics-And-Data-Mining-Services/1394544#ixzz1wTRRkxKw
  • Vijayan, J. (2011) How Predictive Analytics can Deliver Strategic Benefits, ComputerWorld.com, http://www.computerworld.com/s/article/9220131/How_predictive_analytics_can_deliver_strategic_benefits
  • Armonk (2010) IBM: Memphis Police Department Reduces Crime Rates with IBM Predictive Analytics Software, http://www-03.ibm.com/press/us/en/pressrelease/32169.wss
  • Hill, K. (2011) How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did, Forbes.com, http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
