Much of today's data is unstructured and comes from many areas of an organization. Unstructured data cannot be analyzed in the same manner or with the same techniques as structured data. However, both may be used to make decisions. As you may recall from previous weeks, structured data would be something like sales data for a region, broken down by salesperson, quarter, and year. Unstructured data, however, does not fit into predefined categories or follow a defined linear order.
Initial Post Instructions
Review the concept of unstructured data, outline the areas where this data can come from, and evaluate how this data is analyzed and used with structured data to make decisions. While evaluating the process, make sure you discuss data mining (including text and web mining), multi-platform data architecture, data warehouses, data lakes, and Hadoop.
After looking at the concept of unstructured data, compose a scenario where an organization may need to, or can, search unstructured data for a specific purpose. For example, in 2001, the company Enron went into financial ruin. The company was hiding financial losses through blatant accounting fraud by misrepresenting its earnings reports. This went on for several years before the practice was uncovered by the government. Many corporate executives were believed to be associated with the scandal, and the government wanted to gather evidence on these officials for prosecution. The company had millions upon millions of emails from this period and needed to find the emails pertinent to the individuals and the case. The company used text mining to search for keywords and discover emails associated with the situation.
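The keyword-based text mining described in the example can be sketched in a few lines of Python. This is only a minimal illustration, not the method actually used in the Enron case: the `Email` class, the `find_matching_emails` function, the sample messages, and the keyword list are all hypothetical and made up for this sketch.

```python
import re
from dataclasses import dataclass


@dataclass
class Email:
    """Hypothetical, simplified email record used only for this sketch."""
    sender: str
    subject: str
    body: str


def find_matching_emails(emails, keywords):
    """Return (email, matched_keywords) pairs for emails containing any keyword."""
    # One case-insensitive, whole-word pattern per keyword or phrase.
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    results = []
    for email in emails:
        # Search the subject and the body together.
        text = email.subject + "\n" + email.body
        matched = [kw for kw, pattern in patterns.items() if pattern.search(text)]
        if matched:
            results.append((email, matched))
    return results


# Made-up sample data; not drawn from any real email archive.
emails = [
    Email("alice@example.com", "Lunch on Friday", "Are we still on for noon?"),
    Email("bob@example.com", "Q3 numbers",
          "We should move the losses off the balance sheet."),
    Email("carol@example.com", "Re: audit",
          "The auditors asked about the partnership again."),
]
keywords = ["losses", "balance sheet", "audit"]

hits = find_matching_emails(emails, keywords)
for email, matched in hits:
    print(email.sender, matched)
```

Whole-word matching keeps the sketch simple, but it also means that "audit" does not match "auditors". A real search would likely add stemming, synonyms, or sender and date filters to narrow millions of messages down to the pertinent ones.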
Secondary Post Instructions
For a secondary post, suggest to your peers methods, based on their scenario, for improving how the needed information is found.
- Initial post 350 words
- Cite 2 different sources in initial post
- Secondary posts 250 words
- Make sure one of your secondary posts is made at least 12 hours after your first secondary post