Forecasting the B2B dataset using ARIMA (Python)
The first step after cleaning the data is to check it for stationarity by examining the p-value of a stationarity test. The p-value is lower than 0.05, which indicates that the data is stationary. Therefore, the original, undifferenced data can be used for forecasting, which corresponds to d = 0 in the ARIMA order.
Next, the Partial Autocorrelation Function (PACF) and Autocorrelation Function (ACF) are used to choose the autoregressive order p and the moving-average order q of the ARIMA model.
The PACF begins to approach 0 after lag 2 and lag 3, and the same pattern is seen in the ACF. Therefore, with d = 0 because the data is stationary, the following ARIMA models are proposed: (2,0,2), (2,0,3), (3,0,2), and (3,0,3).
From these four models, the following errors were obtained:
Of the three error metrics computed, ARIMA(3,0,3) has the lowest values for RMSE and MSE. ARIMA(3,0,3) is therefore the model used to forecast sales revenue for the next 30 days, based on data from the last 10 months.
Trend Analysis of the B2B dataset
To examine the presence of seasonality in the trend, the following plot was obtained:
The plot shows no seasonal pattern in sales revenue, so a deeper analysis is needed to understand when revenue rises and falls.
Examining the specific dates on which these fluctuations occur shows that they do not align with any events. It can therefore be concluded that the trend is driven by B2B purchasing behavior, which follows each agent's cash flow. This conclusion is supported by further data analysis showing that certain agents purchase more than four times in a single month, but then only once, or not at all, in the following months. Occasionally, there is a single large purchase within a month.
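The per-agent purchase pattern described above can be checked with a monthly count per agent. The sketch below uses pandas with a small made-up order table. The column names `agent_id`, `order_date`, and `amount` are hypothetical, and the rule "more than four purchases in some month and at most one in another" is a simplification of the pattern in the text, since it ignores the order of the months.

```python
import pandas as pd

# Made-up placeholder orders (assumption), not the B2B data.
orders = pd.DataFrame({
    "agent_id": ["A"] * 6 + ["B"] * 6,
    "order_date": pd.to_datetime([
        # Agent A: five purchases in January, one in February, none in March
        "2024-01-03", "2024-01-08", "2024-01-15", "2024-01-22", "2024-01-29",
        "2024-02-12",
        # Agent B: two purchases every month
        "2024-01-05", "2024-01-20", "2024-02-06", "2024-02-21",
        "2024-03-04", "2024-03-19",
    ]),
    "amount": [120, 80, 95, 110, 60, 70, 200, 210, 190, 205, 195, 215],
})

# Count purchases and sum revenue per agent and calendar month.
monthly = (
    orders.assign(month=orders["order_date"].dt.to_period("M"))
    .groupby(["agent_id", "month"])
    .agg(purchases=("amount", "size"), revenue=("amount", "sum"))
)

# Agent-by-month table of purchase counts; months without orders become 0.
counts = monthly["purchases"].unstack("month", fill_value=0)
print(counts)

# Flag agents with a burst month (> 4 purchases) and a quiet month (<= 1).
is_bursty = (counts.max(axis=1) > 4) & (counts.min(axis=1) <= 1)
bursty = counts.index[is_bursty].tolist()
print("irregular purchasers:", bursty)
```

On this placeholder table only agent A is flagged, because agent B buys at a steady rate of two purchases per month.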
To smooth out this irregular purchasing, the sales team could coordinate with the partnership team to negotiate and create promotions for B2B agents that are linked to event dates, encouraging more consistent purchasing behavior.