Most traditional data is structured, i.e., it can be stored in well-defined rows and columns. Legacy transaction systems are a typical example: transactions are stored in a relational database management system (RDBMS), with each row representing one transaction and each column representing an attribute of that transaction.
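To make the row-and-column idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample values are hypothetical and chosen only for illustration; a real transaction system would have its own schema.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per transaction, one column per attribute.
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions (customer, amount, date) VALUES (?, ?, ?)",
    [("alice", 19.99, "2024-01-05"), ("bob", 5.00, "2024-01-06")],
)

# Because every row has the same columns, querying is straightforward.
rows = conn.execute(
    "SELECT customer, amount FROM transactions ORDER BY id"
).fetchall()
print(rows)
```

Every row carries the same attributes in the same order, which is what allows a simple query to address the data directly.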
Semi-structured data has some organizing structure but does not fully conform to a well-defined database schema. Think of an XML file, which stores data with tags that describe it but is not as rigidly defined as a database table.
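As a rough illustration, the sketch below parses a small XML snippet with Python's standard xml.etree.ElementTree module. The element names and values are made up for this example. Note that the two records do not share the same set of fields, which is something a fixed table layout would not allow without empty columns.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: each record is tagged, but the fields vary.
xml_text = """
<customers>
  <customer id="1"><name>Alice</name><email>alice@example.com</email></customer>
  <customer id="2"><name>Bob</name><phone>555-0100</phone></customer>
</customers>
""".strip()

root = ET.fromstring(xml_text)

# Turn each <customer> element into a dict of whatever fields it happens to have.
records = [
    {"id": c.get("id"), **{child.tag: child.text for child in c}}
    for c in root.findall("customer")
]
print(records)
```

The tags give each value a label, so the data can be navigated, yet nothing forces every record to contain the same attributes.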
Unstructured data cannot be categorized as structured or semi-structured and does not have a well-defined structure associated with how it is stored. Think of most tweets or blog posts. They contain relevant information to be mined, but this information is not structured, so special techniques must be applied to extract it. (See Figure)
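One very simple example of such a technique is pattern matching on raw text. The sketch below uses Python's standard re module to pull hashtags and mentions out of a made-up tweet; the tweet text and the patterns are assumptions for illustration, and real text mining usually relies on far more sophisticated methods.

```python
import re

# Hypothetical tweet: free text with no predefined fields.
tweet = "Loving the new phone from @AcmeCorp! Battery lasts all day #gadgets #review"

# Extract pieces of information by pattern, since no schema tells us where they are.
hashtags = re.findall(r"#(\w+)", tweet)
mentions = re.findall(r"@(\w+)", tweet)

print(hashtags)
print(mentions)
```

The extracted values could then be stored in rows and columns, turning part of the unstructured text into structured data.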