Plaid Sandbox Dashboard

Fintech adoption has surged over the past 18 months as the pandemic has presented new financial challenges such as the closure of physical bank branches and forced business – and much of everyday life – online. Plaid, the San Francisco-based fintech, has been uniquely positioned amidst this growth because its services enable users to easily and securely connect their financial accounts to a wide range of leading fintech apps including payment processors (Stripe, Square), investment platforms (Betterment) and credit card companies (Petal).

Plaid allows any user to easily utilize their API to access either their own financial data or sample “sandbox” data that is generated by Plaid. I decided to delve into the past year of sandbox data by extracting it from their API endpoints and exploring key financial information including their assets/liabilities as well as monthly and annual transaction patterns through a Tableau dashboard.

I accessed the Plaid API by using the Postman API platform and Plaid’s Postman Collection which is easily accessible through their GitHub. Before beginning, I created an account on the Plaid Dashboard site that allowed me to access my unique API keys which I then copied into the Postman app in order to validate my API requests. I was then able to send requests to the Plaid API and receive responses such as this one retrieving transaction data from September 20th, 2020 until September 20th, 2021. I decided to focus on two specific API product endpoints, Balance and Transactions, because they provided compelling data about both short-term financial patterns such as daily transaction totals and monthly credit card balance as well as longer-term financial health such as year-to-date transaction totals by merchant category along with mortgage and student loan balances.

Postman extracts the data from Plaid’s API in nested JSON format so I first needed to transform the data by normalizing it into a flat table using the json_normalize method in Pandas. Next, I cleaned the data by removing unnecessary characters such as brackets and quotes as well as null columns and finally exported the data in CSV format.

I created a Postgres database running within a Docker container and used pgAdmin to create two tables – one for Balance and one for Transactions – with the correct schema matching the Plaid API responses and loaded my CSV files into the appropriate tables. Next, I opened up Tableau Desktop, connected to my Postgres server and began to explore the data and build my dashboard. In order to do so, I needed to perform a few custom SQL queries in Tableau; for example, I needed to split a single category string into both a main category as well as a subcategory for more granular analysis as well as exclude certain types of transactions (e.g. deposits) because I was specifically interested in analyzing transactions with merchants. In addition, I created a calculated field in order to accurately portray negative values (i.e. deficits) while visualizing assets and liabilities.

Conclusion

This sample data is generated by Plaid and, as a result, is somewhat repetitive in its patterns; for example, there is a $500 charge each month to United Airlines and corresponding $500 refund that balances out that subcategory. Additionally, although I requested one year of data from the Plaid API, it only returned six months of data from March to September 2021 which is likely a limitation of the sandbox data. However, I think the most interesting thing about this data actually isn’t the specific data at all but rather the ability for a user to just as easily use the Plaid API to access their own financial accounts, load the results into a database and create a powerful dashboard or fintech app of their own. Combined with the rapid growth of “open banking” – providing a third party such as a fintech app the ability to access financial data from a bank through an API – Plaid is poised to continue powering the fintech boom.