A Beginner's Guide to Time Series Forecasting
Time Series Forecasting is not very popular among beginners because it can seem complex. However, it addresses a fundamental data problem: predicting future outcomes when no future records are available. So what is a Time Series, in simple terms? Let's find out in this article.
Fasten your seat belts, and without further ado, let us get started!
Time Series Forecasting
To understand Time Series Forecasting, we must first understand what a Time Series is.
A Time Series, as the name suggests, is a series of information collected over time. For instance, the rain intensity on each day of August can be treated as a time series. August has 31 days; if the rain intensity is recorded against each day and the records are arranged in ascending order of date, the series will look something like this:
Date       Rain intensity
Aug 1st    30 mm
Aug 2nd    45 mm
Aug 3rd    16 mm
……
Now we can comfortably understand Time Series Forecasting. With only data like the above, that is, values recorded against dates, we build a Time Series Forecasting model. Suppose we are to predict the rain intensity on the 1st of September. In a typical machine learning problem, we would have supporting features such as the number of pedestrians on the street, traffic intensity, pollution, and more to help predict the rain intensity.
However, as you may have already figured out, these features will not be available for future dates, so the only input we can be certain of is the future date itself: the 1st of September. This is what makes it a Time Series Forecasting problem: predicting future outcomes based on past time-based data. Do note, however, that Time Series Forecasting differs considerably from Time Series Analysis.
Time Series Analysis tries to describe the data at hand and to explain why the recorded values look the way they do. Time Series Forecasting, on the other hand, takes a predictive approach. It still goes through a few common stages of Time Series Analysis to understand the data in depth, so Time Series Forecasting can be said to be both descriptive and predictive in nature.
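To make the forecasting setup concrete, here is a minimal sketch that predicts the next day's value using nothing but past values. The readings, the year, the function name `naive_forecast`, and the choice of averaging the last three days are all illustrative assumptions, not a recommended model.

```python
from datetime import date, timedelta


def naive_forecast(history, window=3):
    """Forecast the next day's value as the mean of the last `window` values.

    `history` is a list of (date, value) pairs sorted by date.
    Returns (next_date, forecast).
    """
    recent = [value for _, value in history[-window:]]
    next_date = history[-1][0] + timedelta(days=1)
    return next_date, sum(recent) / len(recent)


# Made-up rain readings (mm) for the last days of August; the year is arbitrary.
history = [
    (date(2023, 8, 29), 20.0),
    (date(2023, 8, 30), 25.0),
    (date(2023, 8, 31), 30.0),
]

next_date, forecast = naive_forecast(history)
print(next_date, forecast)
```

Note that the only inputs are the past values and the date being predicted, which is exactly the constraint described above.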
Elements of Time Series
As we saw in the previous section, the defining factor of a time series problem is its sole dependence on time-based features. Let us go through a few time-based elements that are key to Time Series problems:
- Level — Level can be considered as the average or baseline value in the series.
- Seasonality — This is a very important factor that notes the repetitive patterns in time. For instance, a customer is likely to order more on Fridays. That is a 7-day repetitive pattern.
- Trend — Trend is the increase or decrease of values of the target variable with time. This often has a linear pattern which implies there is mostly a progressive increase or continuous decrease over time.
- Cycles — Irregular cyclic patterns can show up in the data that might not be bound to either seasonal boundaries or particular trends.
- Noise — Noise is the random variations in data over time that cannot be explained either through trends, seasonality, or any other pattern.
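One way to get a feel for these elements is to build a synthetic series out of them. The sketch below assumes the components simply add up (an additive model, which is an assumption made here for illustration) and uses made-up numbers. Irregular cycles are left out to keep it short.

```python
import random

random.seed(0)  # fixed seed so the noise is reproducible

n_days = 28

# Level: the baseline value of the series.
level = 100.0

# Trend: a steady increase of 0.5 per day.
trend = [0.5 * t for t in range(n_days)]

# Seasonality: a weekly bump on day index 4 of every 7-day block ("Friday").
seasonality = [10.0 if t % 7 == 4 else 0.0 for t in range(n_days)]

# Noise: random variation that none of the other components explain.
noise = [random.gauss(0, 1) for _ in range(n_days)]

# Additive model (assumption): each observation is the sum of its components.
series = [level + trend[t] + seasonality[t] + noise[t] for t in range(n_days)]

for t in range(7):
    print(t, round(series[t], 2))
```

Reading the printout, the value on day index 4 stands out from its neighbours because of the seasonal bump, while the slow rise across all days comes from the trend.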
Key Concepts of Forecasting
To jump into a time series problem, having a fundamental understanding of the following concepts is vital. Let us have a look at some commonly used techniques in time series.
1. Rolling features: Rolling features summarise the recent past with a statistic such as the mean. For example, a rolling mean with a window of 3 days calculates the mean of the last three days and populates it on the fourth day. This smooths out day-to-day fluctuations and helps capture increasing or decreasing trends in the data.
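If you are looking to build a time series forecasting model in Python, the code snippet below can be used to create a rolling feature with a window of 3 days:
# shift(1) moves the values down one row so the window covers only the three previous days
data['rolling_mean'] = data['net_amount'].shift(1).rolling(window=3).mean()
Where 'net_amount' is the column on which the rolling mean is to be calculated. Without shift(1), pandas includes the current day's value in the window.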
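A small worked example can make the window alignment concrete. The sketch below assumes pandas is installed and uses made-up `net_amount` values. It compares a plain rolling mean, whose window ends on the current row, with a shifted one that uses only the three previous days.

```python
import pandas as pd

# Made-up daily amounts for six consecutive days.
data = pd.DataFrame(
    {"net_amount": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]},
    index=pd.date_range("2023-08-01", periods=6, freq="D"),
)

# Window ends at the current row: the third day gets the mean of days 1 to 3.
data["rolling_incl_today"] = data["net_amount"].rolling(window=3).mean()

# Shift first so the window covers only the three previous days:
# the fourth day gets the mean of days 1 to 3.
data["rolling_prev_3"] = data["net_amount"].shift(1).rolling(window=3).mean()

print(data)
```

The shifted version leaves the first three rows empty (NaN) because there are not yet three previous days to average. When the feature is used to predict the same day's value, the shifted version is usually the safer choice, since it never looks at the value being predicted.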
2. Lagging features: Lagging features are used to capture seasonality in the data. A lagging feature with a lag of 7 takes the value from 7 days earlier and populates it on the current date. The order example makes this easy to see. Suppose today is Friday: a lag of 7 takes the number of orders from last Friday and populates today's record with that number. If customers do tend to order more on Fridays, the lagging feature will capture that. Lagging features therefore primarily help in capturing seasonal patterns.
In Python, a lagging feature with a lag of 7 days can be constructed in the following way:
data['lag'] = data['number_of_orders'].shift(7)
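Here is the Friday example worked through on a tiny dataset. The order counts are made up, and the start date is chosen only because it falls on a Monday; pandas is assumed to be installed.

```python
import pandas as pd

# Two weeks of made-up order counts starting on a Monday; Fridays are higher.
data = pd.DataFrame(
    {"number_of_orders": [5, 6, 5, 7, 12, 8, 6, 5, 7, 6, 6, 13, 9, 7]},
    index=pd.date_range("2023-08-07", periods=14, freq="D"),
)

# Each row receives the order count from exactly 7 days earlier.
data["lag"] = data["number_of_orders"].shift(7)

# The second Friday carries the first Friday's order count as its lag feature.
second_friday = pd.Timestamp("2023-08-18")
print(data.loc[second_friday])
```

The first seven rows have no lag value (NaN), since there is no record from a week earlier. Such rows are typically dropped or filled before training.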
3. ARIMA model: ARIMA is one of the most popular models for solving time series problems. ARIMA stands for AutoRegressive Integrated Moving Average. As the name suggests, it combines two methods, the autoregressive approach and the moving average approach. The "Integrated" part refers to differencing the series (working with the change from one step to the next) to remove trends before the two are applied.
The autoregressive model linearly combines past values of the target variable. The moving average model, on the other hand, linearly combines the errors of past predictions. Combining the two lets the model correct itself using its own past mistakes during training!
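To show the moving parts, the sketch below implements only the differencing step and a one-lag autoregressive step in plain Python. It is not a full ARIMA: the moving average term is left out because estimating it needs an iterative fit, and in practice a library implementation would normally be used. The function names and the series are made up for illustration.

```python
def difference(series):
    """First-order differencing: the 'Integrated' step with one difference."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def fit_ar1(values):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] (no intercept)."""
    num = sum(values[t] * values[t - 1] for t in range(1, len(values)))
    den = sum(values[t - 1] ** 2 for t in range(1, len(values)))
    return num / den  # sketch only: fails if all earlier values are zero


def forecast_next(series):
    """One-step forecast: predict the next difference, then undo the differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    next_diff = phi * diffs[-1]
    return series[-1] + next_diff


# Made-up series that rises by a constant 2 per step.
series = [10.0, 12.0, 14.0, 16.0, 18.0]
print(forecast_next(series))
```

Because every difference in this series equals 2, the fitted coefficient is 1 and the forecast continues the same rise. A perfectly flat series would make every difference zero and break the division, which this sketch does not handle.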
Use Cases of Time Series
Banking Sector: The banking sector is very sensitive since it deals with finances. It is always expected to be prepared. In banking, time series forecasting can be very effectively used to predict the loan amount requests for a future month. This helps the bank to reorganize the funds in advance. It can also be used to predict the total withdrawals and deposits on upcoming dates.
Manufacturing Sector: The manufacturing sector deals with large batch production. Time series forecasting can help in predicting the batch quantities for each day in future months. This can help the manufacturer to estimate the overall profits and invest accordingly.
Agriculture: Agriculture is a key sector in our country and is known to be burdened with heavy exports and high demand in the domestic market. Technology can do wonders for this sector. Time series forecasting can estimate production at harvest, predict weather details such as rain and sunlight intensity, and attempt to predict demand and supply in the market. This can help farmers prepare in advance.
COVID: Time series models have been heavily used to forecast future COVID cases across the globe. They have helped many experts estimate the supplies and arrangements required to fight the global crisis.
Challenges in Time Series Forecasting
Like all Machine Learning models, Time Series Forecasting also has a set of challenges or concerns.
The staleness of the model: Over time, the trends, seasonality, and other characteristics of the data tend to change. This makes the model stale and calls for retraining on more recently recorded data.
Determination of Forecasting Frequency: Forecasting frequency refers to how often predictions are made. For instance, in a weather forecasting problem, it is clear that forecasts need to be made every day and that the actual data needs to be tallied against the forecast at the end of each day. However, there are many use cases where a daily tally is not feasible and a suitable frequency needs to be decided upon.
Data Quality and Data Collection: As the previous point suggests, data collection is often challenging. For instance, population data is tough to find, and its quality cannot be vouched for owing to factors such as manual mistakes, missing data, restricted information, and many others!
Long prediction range: Continuing with the same example, population data can only be tallied every one to two years at the least. This is a long range of time over which predictions have to be generated. Unless backed by other supporting information, time series predictions can turn out to be largely inaccurate over such long ranges.
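The staleness and tallying points above can be tied together in a simple check: compare recent forecasts with what actually happened, and flag the model for retraining when the error grows too large. The sketch below is one possible way to do this. The function names, the 1.5x tolerance, the 7-day window, and all numbers are assumptions made for illustration.

```python
def mean_absolute_error(actuals, forecasts):
    """Average absolute gap between what happened and what was predicted."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def needs_retraining(actuals, forecasts, baseline_mae, tolerance=1.5, recent=7):
    """Flag the model as stale when the error over the most recent `recent`
    points exceeds `tolerance` times the error measured at training time."""
    recent_mae = mean_absolute_error(actuals[-recent:], forecasts[-recent:])
    return recent_mae > tolerance * baseline_mae


# Made-up daily tallies: the forecasts drift away from the actuals over time.
actuals = [100, 102, 101, 105, 110, 115, 121, 128, 134, 140]
forecasts = [101, 101, 102, 103, 104, 105, 106, 107, 108, 109]

print(needs_retraining(actuals, forecasts, baseline_mae=2.0))
```

How often this check runs depends on the forecasting frequency chosen for the use case: daily for something like weather, far less often for something like population data.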
Through the course of this article, we learned about Time Series Forecasting, the elements of a Time Series, some key forecasting concepts, use cases of Time Series Forecasting, and some forecasting challenges.
I hope you enjoyed this article!
Originally published at https://fittechie.in.