A time series analysis and trend analysis on the inflation rate of the Philippines from 1995 to 2023
Did you know that in 2012, bangus (or milkfish) was P76 less expensive per kilo than in 2022 (Statista, 2023)? As prices continue to rise over time, people are finding it more and more difficult to afford things that they used to be able to afford. Will the general public be able to keep up with the increase in prices moving forward? What exactly are the factors that affect the apparent increase in prices? How worried should we be given that inflation is being viewed as an “urgent national concern”?
The title of “the Sick Man of Asia” has once again been given to the Philippines.
In recent years, the Philippines has faced serious problems with inflation: the average yearly inflation rate has soared to 6.6%, making the Philippines the country with the highest inflation rate among all Southeast Asian nations since the pandemic.
However, despite rising inflation and higher overall prices of goods, the majority of workers have yet to receive a wage increase. In fact, the majority of full-time workers receive an average monthly basic pay of PHP 13,000. That is why, in the Philippines, inflation is mostly perceived in a negative light: there is concern that it has adverse effects on people, especially those with a limited income.
For this reason, we decided to focus on the inflation rate. We wanted to see its trend and determine the possible factors or events behind the ongoing worry about being able to afford basic needs and wants.
Specifically, the lack of awareness and understanding about inflation and its effects causes a negative perception of the topic. This leaves policymakers, businesses, and the general public, especially those in lower income brackets, vulnerable and unprepared to deal with rapid changes in inflation and the effects that come with them.
We want to use data science to gather facts about inflation, analyze its trends, and, where possible, predict inflation rates. Our aim is to increase awareness, change our perception of inflation, and help people take steps to minimize its effects on their lives.
How does the inflation rate (of each commodity group) change throughout the years?
How much does the rate of NCR differ from or affect the rate of areas outside NCR (AONCR)?
Null Hypothesis: Inflation rates of the commodities in the "Significant Group" increase over time while the rest do not increase or decrease significantly over time.
Alternative Hypothesis: Inflation rates of the commodities in the "Significant Group" do not increase over time while the rest increase/decrease significantly.
Null Hypothesis: There is a correlation (significant relationship) between the inflation rates of the "All Items" commodity group of NCR and AONCR.
Alternative Hypothesis: There is no correlation (insignificant relationship) between the inflation rates of the "All Items" commodity group of NCR and AONCR.
"Alcoholic Beverages and Tobacco", "Transport", "Food and Non-Alcoholic Beverages", "Health", "Housing, Water, Electricity, Gas, and other Fuels", "All Items"
How will we solve our problem?
Our plan is to use statistics and plots to visualize and analyze the trend of inflation and to identify events that may have caused upward or downward spikes in the inflation rate.
Get the original dataset
After searching the internet, we found a dataset from the website of Bangko Sentral ng Pilipinas (BSP) containing information about inflation, purchasing power, and CPI collected by the Philippine Statistics Authority (PSA).
Learn more about the data collection process
Check out our dataset here!
Making sure that the data is ready is always the first step! From checking missing values to changing types of data and adding columns, we need to ensure that everything is ready for action!
Before diving deeper into the realm of numbers, we created multiple graphs, comparing subsets of the data to one another.
After gaining a better understanding of how the inflation rates move, we used linear regression (with p-values) and cross-correlation to determine the relationship between the different commodity groups and areas.
For our machine learning model, we decided to perform the Box-Jenkins methodology on the data. The Box-Jenkins methodology is a set of steps that helps identify the parameters for the Autoregressive Integrated Moving Average Model, or ARIMA(p, d, q) model, or its seasonal counterpart, SARIMAX(p, d, q)(P, D, Q)s.
The first stage of the Box-Jenkins methodology is to identify key properties of the data: whether it is stationary, whether it is seasonal, and what order of differencing it needs.
This phase creates the machine learning model and checks its performance by measuring the error between the predictions made by the model and the actual inflation rate data.
After creating the models and calculating their prediction errors, we tried to make sense of them by analyzing what the values mean and how they impact the overall prediction of the inflation rate.
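As a rough illustration of the differencing step in the identification stage described above, the sketch below applies a first difference and a seasonal difference to a small series. The series is made up, the `difference` helper is hypothetical, and the season length of 4 is chosen only to keep the example short; none of these come from the project's Colab.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[t] - series[t - lag]."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Made-up values (assumption): a linear trend plus a pattern that repeats every 4 steps.
pattern = [0.0, 1.0, 0.0, -1.0]
values = [0.5 * t + pattern[t % 4] for t in range(12)]

# First difference (d = 1): removes the linear trend but keeps the repeating pattern.
first_diff = difference(values, lag=1)

# Seasonal difference (D = 1, s = 4): removes the repeating pattern,
# leaving only the constant change contributed by the trend.
seasonal_diff = difference(values, lag=4)

print(first_diff)
print(seasonal_diff)
```

If a differenced series looks flat, with no drift and no repeating pattern left, that is a hint that the chosen differencing order may be enough. A formal stationarity test would normally be used to confirm it.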
So… what exactly did we find and what does this tell us?
To test the hypothesis, a simple linear regression test based on the OLS model was used to determine the general slope of the line of inflation when graphed against time.
For most of the results, the regression yielded a positive coefficient. The exceptions were two commodity groups: “Alcoholic Beverages and Tobacco” and “Recreation, Sport, and Culture”.
A positive coefficient indicates that the inflation rate has an increasing trend over time, while a negative coefficient indicates a decreasing trend.
This indicates that while the commodity groups show a trend (increase or decrease) over time, the trend is not very significant. That is, there is not much of a linear relationship between their inflation rates and time.
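To make the trend test above more concrete, here is a minimal sketch of fitting a straight line to a series against time by ordinary least squares. The `ols_trend` helper and the five sample values are hypothetical, and the p-value is left out because it needs a t-distribution that a statistics library would normally supply.

```python
def ols_trend(values):
    """Fit values[t] = intercept + slope * t by ordinary least squares.

    Returns (slope, intercept, r_squared), using t = 0, 1, 2, ... as time.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    ss_tot = sum((y - mean_y) ** 2 for y in values)
    # R^2 is the share of variation explained by the fitted line.
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


# Made-up yearly inflation rates in percent (assumption, not project data).
sample_rates = [4.0, 3.0, 5.0, 2.0, 6.0]
slope, intercept, r_squared = ols_trend(sample_rates)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

A positive slope paired with a low R² is one way to see the kind of result described above: the fitted line points upward, but it explains little of the variation in the series.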
The graph indicates that at time lag 0, there is a significantly high correlation between the two. At other time lags, however, the correlation fluctuates around 0, suggesting that there is no significant correlation between the two time series when a lag is applied.
The Granger causality test backs this up. Since the p-value fluctuates depending on the lag, there is some correlation between the two variables, but one cannot reliably be used to predict the other.
This means that we can use NCR data to predict the inflation rate of AONCR and vice versa, but only up to a certain point. This indicates the fluctuating nature of inflation and how hard it is to predict.
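The lagged comparison discussed above can be sketched as follows. This is a simplified version that computes a Pearson correlation over the overlapping part of the two series at each lag, which differs slightly from the normalization some libraries use. The `cross_correlation` helper and both series are hypothetical; the second series is built as an exact linear function of the first so that the lag-0 result is easy to check.

```python
from math import sqrt


def cross_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] over the overlapping span."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    if abs(lag) > len(x) - 2:
        raise ValueError("lag leaves fewer than two overlapping points")
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)


# Made-up inflation rates in percent (assumption, not project data).
ncr = [3.0, 4.5, 2.0, 5.0, 3.5, 6.0, 2.5, 4.0]
aoncr = [2 * v + 1 for v in ncr]  # exact linear function of ncr

by_lag = {lag: cross_correlation(ncr, aoncr, lag) for lag in range(-3, 4)}
for lag, value in by_lag.items():
    print(f"lag {lag:+d}: {value:+.3f}")
```

A positive lag here pairs each NCR value with a later AONCR value, so a strong correlation at a nonzero lag would hint that one series leads the other.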
When predicting the inflation rate of the commodity groups in the Significant Group, most models had a generally high root mean squared error (RMSE), indicating that they were not able to predict the inflation rate well.
Based on the graphs of actual versus predicted inflation rates (available in the Colab), the predictions were generally far off from the actual inflation rates.
This suggests that the inflation rate is highly unpredictable and very hard to model, partly because of the unpredictable nature of human behavior. Since the inflation rate depends on many real-world factors such as demand, policies, and the general economy, it can be very hard to measure and forecast.
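For reference, the RMSE mentioned above is the square root of the average squared gap between actual and predicted values. The sketch below uses a hypothetical `rmse` helper and made-up numbers, not the project's forecasts.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up inflation rates in percent (assumption, not project data).
actual_rates = [3.0, 4.0, 5.0, 6.0]
predicted_rates = [2.0, 4.0, 7.0, 6.0]

error = rmse(actual_rates, predicted_rates)
print(f"RMSE: {error:.3f} percentage points")
```

Because the errors are squared before averaging, a few large misses raise the RMSE much more than many small ones. The result is in the same unit as the data, so it can be read directly against typical inflation values.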
It might seem that the behavior of inflation is unstable and unpredictable, but we can decompose it into its respective components. While there seems to be an increasing trend in the inflation rate, it is not strictly increasing either: we observed that the data is mostly stationary and seasonal, meaning there are patterns that we can observe and use as reference. Since inflation seems to be stationary and positive, we can guess that prices will continue to increase at roughly the same rate as before.
The inflation rate also fluctuates rapidly depending on the state of the economy while following a stationary average. Furthermore, it is hard to predict whether the inflation rate of the upcoming month or year will significantly increase or decrease from the present, although we can be sure that inflation will continuously fluctuate up and down, as it always has.
Now that we have a better understanding of the trends of the inflation rate and how hard it is to predict, what does this tell us?
This tells us that, as much as possible, we should try to be aware of the overall economy and how it can affect us. A lack of understanding of inflation and how it changes can leave us vulnerable to the possible dangers of those changes. Understanding the economy and staying aware of its changes is never a bad thing: because the economy is so unpredictable, that awareness helps us make better decisions now that pay off in the long run.
Another interesting, or maybe not so interesting, finding is that it is extremely difficult to predict the inflation rate. Since we are limited to historical data and the parameters it provides, we are also limited by the capabilities of mathematical models.
Meanwhile, inflation is very much influenced by real-world events happening in real time. While we cannot control the economy, the inflation rate, or the prices of goods and services, by knowing and understanding these concepts, we can become more conscious about how we earn and spend our money to protect ourselves from the impacts inflation can bring.
What can we do?
Have a basic understanding of how the economy works. Even something as simple as understanding what affects changes in prices and the inflation rate can go a long way, as it helps us understand how these concepts affect our livelihoods.
You can never be too prepared. When the pandemic hit, many people lost their jobs and were forced into tight financial situations. Always have a contingency plan for these rare but very possible scenarios.
Consider which commodities are necessities. It is always better to have extra funds for the necessities, even if it means spending less on our present wants. Additionally, by identifying which commodities are essential to our lifestyle, we can try to find ways to lower our spending on them, especially if their prices have increased.
This data science project is far from perfect. In fact, we had the following limitations:
Only the inflation rate was analyzed, which may be harder to understand and conceptualize compared to actual prices of goods.
Only the significant commodity groups were modeled due to time constraints. Furthermore, only one model was created per significant commodity group and was not extensively refined to generate more accurate predictions.
The data was only analyzed numerically (through graphs and numerical analysis) but not contextually. As such, this data project only shows the changes in inflation but does not explain why those changes occurred or give context behind each data point.
From these limitations, there are some things we recommend for future research projects:
Try to spend more time creating a more accurate model: not a perfect machine learning model, but one that can be continuously refined through multiple trials and model assessments.
Instead of trying to predict the inflation rate, future research could focus on identifying the factors behind it, such as why there was a sudden increase in the inflation rate of Alcoholic Beverages and Tobacco in 2008. This would raise awareness of how social, political, or economic events affect inflation.
Future projects could also create prediction models that approach inflation in different ways, weighing the pros and cons of each. They could, for example, sacrifice some of the accuracy of our model to allow for more randomness in the predictions.
Thoughts? We'd Love to Hear from You!