After a decade of programming around normalized, structured data, the era of big data has moved the focus back to unstructured data and further divided it into subcategories. The industry now offers different techniques for dealing with each kind of data structure.
Different Data Structures:
- Structured Data
Structured data is mainly found in traditional database designs and is composed of various data types for storing text, images, media, etc. Other data sources include OLAP, CSV, DBMS, etc.
Semi-Structured data includes text files with a defined pattern that enables parsing, such as XML data files that are self-describing and defined by an XML schema.
Quasi-Structured data includes textual data with erratic formats that can be brought into shape with tools, such as web clickstream data. Such data is typically obtained from logs. Web server logs are a good example: they can be parsed and mined to discover usage patterns and to uncover relationships and areas of interest on a website or group of sites.
Unstructured Data has no inherent structure and is available as text, PDF, images, videos, etc.
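
To make the difference between semi-structured and quasi-structured data more concrete, here is a minimal Python sketch. It rests on several assumptions that are not part of the text above: the `<orders>` XML document and the log lines are made-up sample data, the log lines are assumed to follow the Common Log Format, and the helper names `parse_orders` and `count_page_hits` are hypothetical. The XML sample can be read with a standard parser because its tags describe the data, while the log lines need a hand-written pattern and a fallback for lines that do not match.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Semi-structured: the tags describe the data, so a generic XML parser is enough.
# This document is invented sample data for illustration only.
XML_SAMPLE = """
<orders>
  <order id="1"><item>book</item><qty>2</qty></order>
  <order id="2"><item>pen</item><qty>10</qty></order>
</orders>
"""


def parse_orders(xml_text):
    """Turn each <order> element into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "id": order.get("id"),
            "item": order.findtext("item"),
            "qty": int(order.findtext("qty")),
        }
        for order in root.findall("order")
    ]


# Quasi-structured: web server log lines (assumed Common Log Format).
# The third line is deliberately malformed to mimic erratic input.
LOG_SAMPLE = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /products HTTP/1.1" 200 5120',
    "garbled line without the expected fields",
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "GET /products HTTP/1.1" 404 312',
]

# Hand-written pattern: the format is a convention, not a self-describing schema.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def count_page_hits(lines):
    """Count requests per path, skipping lines that do not match the pattern."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # erratic line: ignore it instead of failing
        hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    print(parse_orders(XML_SAMPLE))
    print(count_page_hits(LOG_SAMPLE).most_common())
```

Counting hits per path is only one simple form of the usage-pattern mining described above. A real clickstream analysis would probably also group requests by visitor and by session.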