Figure 4 Monnappa 2018
Firstly, what is big data?
Now, let me introduce the general background of big data. The Gartner IT Glossary (n.d.) defines big data as “high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making”.
Put simply, big data is a data set so large and complex that it cannot be captured, managed and processed with traditional databases, hardware and software within a tolerable timeframe (Chen et al., 2014). In today’s digital age, almost every digital action we take leaves a digital footprint. We generate data whenever we go online or use a connected service: when we carry smartphones with GPS, when we log in to shop online with a user name and password, when we communicate with friends and family on social media platforms, when we swipe a Medicare card after visiting the doctor, or when we use a Flybuys card to collect points after shopping at Coles.
A “3Vs” model has been used to define big data, and this definition was widely used by large organisations, including IBM and research departments at Microsoft, in the ten years after it was introduced in 2001 (Laney, 2001). Under this model, big data has one or more of the following characteristics: high volume, high velocity or high variety.
Chen et al. (2014) use volume to refer to the size and scale of big data. The amount of data being collected and generated keeps growing, driven by the latest advances in information technology and the rapid growth of cloud computing. Marr (2018) estimates that by 2020 the amount of digital data we need to store will have grown from 5 zettabytes today to 50 zettabytes.
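To put the projected growth in volume into perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes the decimal (SI) definition of a zettabyte as 10^21 bytes, and the 1-terabyte drive is a hypothetical unit of comparison, not something taken from Marr (2018).

```python
# Rough arithmetic on the 5 ZB -> 50 ZB figures quoted from Marr (2018).
# Assumption: SI decimal units, so 1 zettabyte = 10**21 bytes.
ZETTABYTE = 10**21
# Assumption: a hypothetical 1-terabyte drive (10**12 bytes) for comparison.
TERABYTE_DRIVE = 10**12

current_zb = 5
projected_zb = 50

# How many times larger the projected amount is than the current amount.
growth_factor = projected_zb / current_zb

# Number of 1 TB drives needed to hold the projected amount of data.
drives_needed = (projected_zb * ZETTABYTE) // TERABYTE_DRIVE

print(f"Growth factor: {growth_factor:.0f}x")
print(f"1 TB drives needed for {projected_zb} ZB: {drives_needed:,}")
```

Under these assumptions, the projection is a tenfold increase, and storing 50 zettabytes would take about 50 billion one-terabyte drives.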
Velocity refers to the speed with which data is generated, analysed and reprocessed (Chen et al., 2014, p.173). Let’s take WordPress as an example. Posts are published on WordPress every day, so the platform has to handle numerous photos, images, text and video. It has to ingest all of this content, process it, categorise it, file it and label it so that it can be retrieved later.
Gandomi & Haider (2015) refer to variety as “the structural heterogeneity in a dataset”. Put simply, it refers to different data types and data sources. There are three types of data: structured, semi-structured and unstructured. Tabular data from spreadsheets and relational databases is structured data. Text, images, audio and video are examples of unstructured data. Extensible Markup Language (XML) is an example of semi-structured data.
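The difference between the three data types is easier to see with a small example. The Python sketch below uses made-up sample data (the customer, order and review values are all hypothetical) and only standard-library modules. It is meant as an illustration, not as a way to process real big data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: tabular data where every row has the same named columns.
csv_text = "customer_id,item,price\n1,milk,2.50\n2,bread,3.00\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: XML tags describe the data, but records can differ
# in shape (the second order carries an extra <note> element).
xml_text = """<orders>
  <order id="1"><item>milk</item></order>
  <order id="2"><item>bread</item><note>gift</note></order>
</orders>"""
orders = ET.fromstring(xml_text).findall("order")
order_fields = [sorted(child.tag for child in order) for order in orders]

# Unstructured: free text has no schema, so we can only treat it as
# raw content until further analysis gives it structure.
review = "Great service, but the bread was a bit stale."
word_count = len(review.split())

print(rows[0]["item"])  # a column can be looked up by name
print(order_fields)     # the fields vary from record to record
print(word_count)       # only simple measures are available up front
```

The structured rows can be queried by column name straight away, the XML records have to be inspected one by one because their fields vary, and the free text offers nothing beyond its raw content until it is analysed further.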
Value has emerged as a fourth characteristic over the past few years. The 4Vs definition model was introduced by NIST and is widely recognised, since finding value in big data is a discovery process that adds assets to businesses (Chen et al., 2014, p.174).
Figure 5 Chen et al. 2014
In addition, veracity has been added as a fifth characteristic of big data (Marr, 2015). It refers to “the messiness or trustworthiness of the data”. How truthful the data a business collects is, and how much the business can rely on it, determines the data’s value.
Big data is changing our world completely. An overwhelming amount of data provides evolutionary insight and opportunity to businesses across every industry. Businesses can accurately predict consumers’ shopping patterns for particular products or services. However, this also raises concerns such as data privacy and security. Should we worry about how organisations take advantage of data?
Chen, M, Mao, S & Liu, Y 2014, ‘Big data: a survey’, Mobile Networks and Applications, vol. 19, no. 2, pp. 171-209.
Gandomi, A & Haider, H 2015, ‘Beyond the hype: big data concepts, methods, and analytics’, International Journal of Information Management, vol. 35, no. 2, pp. 137-144.
Laney, D 2001, ‘3-d data management: controlling data volume, velocity and variety’, META Group Research Note, 2001, viewed 19 September 2018, https://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf
Marr, B 2018, ‘What is big data? A super simple explanation for everyone’, Bernard Marr & Co., viewed 19 September 2018, https://www.bernardmarr.com/default.asp?contentID=766
Marr, B 2015, ‘Why only one of the 5 Vs of big data really matters’, IBM Big Data & Analytics Hub, web log post, 19 March, viewed 19 September 2018, https://www.ibmbigdatahub.com/blog/why-only-one-5-vs-big-data-really-matters
Monnappa, A 2018, ‘Data Science vs. Big Data vs. Data Analytics’, image, viewed 19 September 2018, https://www.simplilearn.com/data-science-vs-big-data-vs-data-analytics-article
World Economic Forum, 2016, What is big data? video recording, YouTube, viewed 19 September 2018, https://youtu.be/eVSfJhssXUA