2014-08-11-financial-trend-analysis.md

File metadata and controls

74 lines (59 loc) · 1.72 KB
---
layout: post
title: Financial Trend Analysis
date: 2014-08-11 08:42:21 -0800
categories: selenium testing
---

Purpose - to visualize spending habits and identify trends.

Basic process overview

  • gather data files from various locations
  • process data files into one format
      • perhaps [date, transaction amount, name]
  • graph the unified data
      • first pass just graphs all the transactions
      • at a later point we could make improvements to the graph
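
The unification step could look something like the following. This is a minimal sketch, assuming each source file has already been read into rows of (date, debit, credit, description) with exactly one of debit or credit filled in; the names `Transaction` and `unify` are hypothetical and not part of the original scripts.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical unified record: [date, transaction amount, name].
Transaction = namedtuple("Transaction", ["date", "amount", "name"])


def unify(rows, date_format="%m/%d/%y"):
    """Convert (date, debit, credit, description) rows into Transactions.

    Assumes exactly one of debit/credit is filled in per row.
    """
    unified = []
    for date_str, debit, credit, description in rows:
        # Prefer the credit column when it is non-empty, else use debit.
        amount = credit if credit.strip() else debit
        unified.append(
            Transaction(
                date=datetime.strptime(date_str.strip(), date_format).date(),
                amount=round(float(amount), 2),
                name=description.strip(),
            )
        )
    # Sort by date so data merged from several files graphs in order.
    return sorted(unified, key=lambda t: t.date)
```

Rows from several files can be concatenated into one list before calling `unify`, since the result is sorted by date regardless of input order.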

Bash file to parse a bank export:

{% highlight bash %}
#!/bin/bash

INPUT=bank_datafile.csv
OLDIFS=$IFS
IFS=,   # split each line on commas
[ ! -f "$INPUT" ] && { echo "$INPUT file not found"; exit 99; }
while read -r date no description debit credit
do
  # Use the credit column when it is filled in, otherwise the debit column.
  if [ ${#credit} -gt 0 ]; then
    amount=$credit
  else
    amount=$debit
  fi

  echo "$date,$( printf "%.2f" "$amount" ),$description"
done < "$INPUT"
IFS=$OLDIFS
{% endhighlight %}

Remove the double quotes and multiple spaces:

{% highlight bash %}
$ sed -i.bak 's/"//g' pre_parsed_datafile.csv
$ sed -i.bak 's/ \{2,\}/ /g' pre_parsed_datafile.csv
{% endhighlight %}
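
The same cleanup can be done in Python if in-place `sed` is not convenient. This is a rough sketch; `clean_line` is a hypothetical helper name. Like the `sed` commands, it strips every double quote, so a quoted field that contains a comma would be split into two fields later.

```python
import re


def clean_line(line):
    """Drop double quotes, then collapse runs of spaces into one space."""
    without_quotes = line.replace('"', "")
    return re.sub(r" {2,}", " ", without_quotes)
```

For example, `clean_line('"09/03/13","COFFEE   SHOP",-9.89')` gives `09/03/13,COFFEE SHOP,-9.89`.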

Read the parsed file and print data:

{% highlight python %}
import csv
import datetime
import time

with open('cleaned_transaction.data', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        date_str = row[0]

        date_num = datetime.datetime.strptime(date_str, "%m/%d/%y")
        # mktime returns seconds (local time); scale to milliseconds.
        millis_since_epoch = time.mktime(date_num.timetuple()) * 1000
        print(millis_since_epoch, ",", row[1])
{% endhighlight %}
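
One thing to be aware of: `time.mktime` interprets the date in the machine's local time zone, so the same input file yields different timestamps on machines configured for different zones. If reproducible values matter, a variant pinned to UTC might look like this. It is a sketch only; `to_epoch_millis` is a hypothetical helper name.

```python
import calendar
from datetime import datetime


def to_epoch_millis(date_str, date_format="%m/%d/%y"):
    """Return milliseconds since the Unix epoch for midnight UTC on date_str."""
    parsed = datetime.strptime(date_str, date_format)
    # timegm treats the struct_time as UTC, unlike time.mktime (local time).
    return calendar.timegm(parsed.timetuple()) * 1000
```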

Plot arrays of data:

{% highlight python %}
import matplotlib.pyplot as plt

x = [1378191600000.0, 1378191600000.0, 1378105200000.0]
y = [-9.89, -10.48, -5.69]
plt.plot(x, y, 'ro')
plt.show()
{% endhighlight %}
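
Raw epoch milliseconds make for an unreadable x axis. One possible improvement is to convert them back to dates before plotting. The sketch below assumes the values are epoch milliseconds as produced above; `to_datetimes` and `plot_transactions` are hypothetical names.

```python
from datetime import datetime, timezone


def to_datetimes(millis_values):
    """Convert epoch milliseconds into timezone-aware UTC datetimes."""
    return [
        datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
        for ms in millis_values
    ]


def plot_transactions(millis_values, amounts):
    """Plot amounts as red dots against calendar dates."""
    # Imported here so the conversion helper works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(to_datetimes(millis_values), amounts, "ro")
    ax.set_xlabel("Date")
    ax.set_ylabel("Amount")
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    plt.show()
```

Calling `plot_transactions(x, y)` with the arrays above should draw the same three points, with dates along the x axis instead of large numbers.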

Financial analysis