SAS and Big Data

With COVID restrictions keeping more people, including me, working from home, I’ve had more time to putter around in my garden. I’ve never considered myself a green thumb, but without the commute to work, I can finally pay attention to my plants. Before this, I had been known to kill a cactus or two, so imagine my joy when I was able to grow a melon from seed.

I guess being data-driven is in my bones, so I started to document my plant care. I started scribbling notes about what seeds I was sowing, how long it would take to germinate, what kind of soil I was using, how much water I was giving, light conditions, fertiliser, plant height, and on occasion, I would even test pH levels. I also took many pictures and videos like a proud plant parent would. It started as some sort of diary but eventually, I formalised things and started recording the information in a spreadsheet.

I wanted to be able to repeat my success. I knew that I was doing something right. I had an edible fruit as proof. As a newbie plant enthusiast, however, I didn’t really understand what that something was.

Small Garden, Big Data

Data analytics is everywhere and can be applied to almost any part of our lives. In recent years, we’ve been hearing the buzzword Big Data. I think my planting escapades perfectly illustrate how Big Data can quite literally hit close to home.

On Variety: Structured vs Unstructured Data

Structured data is any information that you can store in table form. Each row is an observation, and each column an attribute. Loosely put, structured data is quantitative.
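To make that concrete, here is a minimal Python sketch of what a few rows of a plant-care log might look like. The column names and values are invented for illustration and are not taken from my actual spreadsheet.

```python
import csv
import io

# Hypothetical plant-care log: the column names and values are made up
# for illustration only.
COLUMNS = ["date", "plant", "water_ml", "height_cm", "soil_ph"]

rows = [
    {"date": "2021-03-01", "plant": "melon", "water_ml": 250, "height_cm": 4.0, "soil_ph": 6.5},
    {"date": "2021-03-08", "plant": "melon", "water_ml": 300, "height_cm": 7.5, "soil_ph": 6.4},
    {"date": "2021-03-15", "plant": "melon", "water_ml": 300, "height_cm": 12.0, "soil_ph": 6.6},
]


def to_csv(records):
    """Serialise the records as CSV text: a header plus one line per observation."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def column(records, name):
    """Pull out one attribute (column) across all observations (rows)."""
    return [record[name] for record in records]


def average_growth(records):
    """Mean height gain between consecutive observations."""
    heights = column(records, "height_cm")
    gains = [later - earlier for earlier, later in zip(heights, heights[1:])]
    return sum(gains) / len(gains)


print(to_csv(rows))
print(average_growth(rows))
```

Because every row has the same attributes, a question such as "how fast is the melon growing?" becomes a simple calculation over one column.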

Quantitative data allows data scientists to find meaningful relationships between observed attributes. The most common quantitative data analysis methods include regression analysis, Monte Carlo simulation, factor analysis, cohort analysis, cluster analysis, and time series analysis. These methods have been around for decades and have guided much data-driven decision-making in academia and the business world.

Unstructured data is anything that cannot be stored in a table. Often, qualitative data falls under this category. Unstructured data can include long-form text comments, remarks, emails, photos, audio files, videos, Twitter feeds, and sensor feeds such as weather data or data from wearables.

IBM estimates that 2.5 quintillion bytes (2.5 million terabytes) of data are created every day, an estimated 80% of which is unstructured. If we’re performing data analytics mainly on the 20% that is structured, we’re sitting on a gold mine of untapped insight.
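For anyone who wants to check the arithmetic, here is a small Python sketch of the conversion. It assumes decimal (SI) units, where one terabyte is 10^12 bytes, and simply applies the 80/20 split quoted above.

```python
# Decimal (SI) units: 1 terabyte = 10**12 bytes, 1 quintillion = 10**18.
BYTES_PER_TERABYTE = 10**12
daily_bytes = 25 * 10**17  # 2.5 quintillion bytes, kept as an exact integer

daily_terabytes = daily_bytes // BYTES_PER_TERABYTE

# 80% unstructured, 20% structured (the shares quoted above).
unstructured_tb = daily_terabytes * 80 // 100
structured_tb = daily_terabytes - unstructured_tb

print(f"{daily_terabytes:,} TB created per day")
print(f"{unstructured_tb:,} TB unstructured")
print(f"{structured_tb:,} TB structured")
```

On these figures, roughly two million terabytes a day fall outside what traditional table-based analysis covers.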

Advances in Insight from Unstructured Data

In recent years, Artificial Intelligence has propelled the analysis of unstructured data. AI is a collection of analytical methodologies and algorithms that reveal insight from both structured and unstructured data. This collection of advanced analytics tools includes machine learning, computer vision, forecasting and optimisation, and natural language processing.

Further, cheaper and smaller sensors can now be embedded in almost anything. The readings can be stored digitally in a continuous feed. For a data scientist, that means that there are more unstructured data sources available for analysis.

Cloud-based technologies have provided data scientists with more access to the hardware and software needed for their analyses. Many vendors now offer their proprietary analytics tools as a service. Aside from SaaS, researchers have benefited from storage and computing power now made more affordable with cloud solutions.

Storage is now so accessible that researchers can choose to collect unstructured data for future purposes. They don’t have to create a data model for the information just yet. Efficient processing is still possible at a later date. In the meantime, information can be dumped into a data lake and dealt with later.
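The "collect now, model later" idea can be sketched in a few lines of Python. This is only an illustration: a plain list stands in for the data lake, the records are hypothetical, and a real data lake would sit on cloud storage rather than in memory.

```python
import json

# Stand-in for a data lake: raw records are appended as JSON text with no
# agreed schema. The records themselves are hypothetical.
raw_store = []


def dump(record):
    """Store a record as-is; no data model is required at write time."""
    raw_store.append(json.dumps(record))


dump({"plant": "basil", "note": "leaves yellowing after repotting"})
dump({"plant": "melon", "height_cm": 12.0, "sensor": {"soil_moisture": 0.31}})
dump({"plant": "basil", "height_cm": 9.5})


def read_heights(store):
    """Apply a model at read time: keep only records that carry a height."""
    table = []
    for line in store:
        record = json.loads(line)
        if "height_cm" in record:
            table.append((record["plant"], record["height_cm"]))
    return table


print(read_heights(raw_store))
```

Nothing is thrown away when the data is written. The free-text note about the basil stays in the store and can be analysed later with a different model.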

Additionally, with data analytics being performed on the cloud, collaboration is made easier. Discoveries don’t need to happen in a single location. They can occur simultaneously from anywhere in the world.

Industry Uses

The applications of Big Data are endless. Each industry has an infinite number of use cases. For example:

Big Data, SAS, and MySASteam

SAS has a complete analytics platform designed for Big Data. Its data management and advanced analytics tools allow you to use information whatever its volume, velocity, or variety so that you extract the most value for your organisation. For expert advice on how to maximise SAS for your Big Data requirements, there’s always MySASteam.

I’m still figuring out what is causing my basil plants to fail, but I’m confident that the answer is just in the data. I hope you find ways to make data-backed decisions in your organisations and even in your daily lives.

The post SAS and Big Data appeared first on MySASteam.

