Interactive Web Application Big Data Project
Background: Real estate property taxes are a large source of local government revenue.
However, real-world data reveal inequities in how these taxes fall across
the country. In simpler terms, low-value properties
usually face higher property tax rates.
Project Description: I built an interactive property tax map web application, hosted on AWS EC2,
to support graphical analysis. The application shows the geographic distribution of
assessment and tax rates on an interactive map of the US. Users can view the data
at the state level and drill down to the county and census tract levels. This gives policy makers a clear basis
for adjusting local policies to lower the real estate tax burden on lower-income residents.
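The state, county, and tract drill-down can be illustrated with census GEOIDs: in an 11-digit tract code, the first 2 digits identify the state and the first 5 identify the county. The sketch below is an illustration only, not the application's code. The rates and tract codes are made up, `tract_rates` and `roll_up` are hypothetical names, and an unweighted mean stands in for whatever aggregation the application actually uses.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tract-level tax rates keyed by 11-digit census tract GEOID
# (2-digit state + 3-digit county + 6-digit tract). Values are made up.
tract_rates = {
    "17031010100": 0.021,
    "17031010201": 0.019,
    "17043840000": 0.024,
    "36061000100": 0.009,
}


def roll_up(rates, prefix_len):
    """Average tract rates over all tracts sharing a GEOID prefix."""
    groups = defaultdict(list)
    for geoid, rate in rates.items():
        groups[geoid[:prefix_len]].append(rate)
    return {prefix: mean(values) for prefix, values in groups.items()}


county_rates = roll_up(tract_rates, 5)  # keyed by state + county code
state_rates = roll_up(tract_rates, 2)   # keyed by state code
```

A production version would more likely weight each tract, for example by number of parcels, rather than average tracts equally.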
Tools and Frameworks: Hadoop, Apache Spark, Hive, HBase, Kafka, AWS EMR, EC2, S3, CodeDeploy, Java, JavaScript, HTML, Scala, HQL.
Deep Learning & Parallelism Project
Published in MLSys - GNNSys 2021 [ArXiv]
Background: Graph Neural Networks (GNNs) are an emerging class of deep learning models
designed for problems on irregular, graph-structured datasets. Much real-world data is naturally represented
as graphs, for example large social networks and protein and drug structures.
Graph learning has also been applied successfully in domains such as
drug design, natural language processing, and recommender systems. Unfortunately,
training GNNs is much slower than training traditional deep neural networks.
Project Description: To make GNN training and inference more efficient when working
with multiple GPUs,
we propose using pipeline parallelism. This can be applied at both the neural network layer level
and the data level. We implemented parallelized GNN models and compared their
performance with that of their unparallelized counterparts when trained on multiple
GPUs.
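As a rough illustration of the idea, and not the implementation from the paper, the sketch below shows a GPipe-style forward schedule in plain Python. The model is assumed to be split into sequential stages (one per GPU) and each mini-batch into micro-batches, so that stage `s` can work on micro-batch `m` at time step `s + m`. The function names are hypothetical, and a list of integers stands in for real training data.

```python
def split_into_micro_batches(batch, num_micro_batches):
    """Split a batch (any sliceable sequence) into roughly equal chunks."""
    size = -(-len(batch) // num_micro_batches)  # ceiling division
    return [batch[i:i + size] for i in range(0, len(batch), size)]


def forward_schedule(num_stages, num_micro_batches):
    """For each time step, list the (stage, micro_batch) pairs that run concurrently."""
    steps = []
    for t in range(num_stages + num_micro_batches - 1):
        active = [
            (stage, t - stage)
            for stage in range(num_stages)
            if 0 <= t - stage < num_micro_batches
        ]
        steps.append(active)
    return steps


# Hypothetical setup: 8 samples, 4 micro-batches, model split across 2 stages.
micro_batches = split_into_micro_batches(list(range(8)), 4)
schedule = forward_schedule(num_stages=2, num_micro_batches=4)
```

With 2 stages and 4 micro-batches the forward pass takes 5 time steps, compared with 8 if each stage waited for the other to finish every micro-batch.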
Tools and Frameworks: PyTorch Geometric, Deep Graph Library, GPipe,
Graph Neural Networks, Deep Learning, Google Colaboratory,
Argonne National Laboratory Servers.
Quantitative Trading Bot Project
Background: This is a personal project in which I designed an algorithmic trading tool that lets me
make automated financial investments without monitoring the market manually.
Project Description: I created an auto-trading bot that monitors and retrieves real-time market data, applies
algorithmic trading strategies, and executes trades automatically. So far, I've been able to
maintain a stable daily profit of at least 1% using the tool. I plan to add more advanced
algorithms and strategies to improve profits.
Tools and Frameworks: JavaScript, Python, HTML, real-time financial data APIs, trading libraries.
Municipal Finance Inequity Analysis Project
Background: This project is part of the research done by our team at the Center for Municipal Finance
at the UChicago Harris School of Public Policy.
Project Description: I generated graphics and county/city analysis reports,
designed novel algorithms for duplicate detection and pattern matching,
and implemented Python and R scripts for publication in a book and for future replication.
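The project's own duplicate-detection algorithms are not described here. As a generic illustration of the task only, the sketch below flags likely duplicate records with fuzzy string matching from Python's standard library. The sample records, the `normalize` and `find_likely_duplicates` helpers, and the 0.9 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = text.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def find_likely_duplicates(records, threshold=0.9):
    """Return pairs of records whose normalized similarity meets the threshold."""
    pairs = []
    for a, b in combinations(records, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical records; the first two differ only in case and punctuation.
records = [
    "123 N. Main St, Springfield",
    "123 n main st springfield",
    "45 Oak Avenue, Chicago",
]
duplicates = find_likely_duplicates(records)
```

Comparing every pair is quadratic in the number of records, so a larger dataset would need some form of blocking (for example, comparing only records in the same county) before pairwise scoring.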
Tools and Frameworks: R Markdown, WordPress, PostgreSQL, Stata, HTML, Box.
Background: COVID-19 lockdown policies are necessary, but their results have generally been unsatisfactory.
It is reasonable to ask whether technology can help policy makers
craft better policies.
Project Description: I analyzed location, weather, and mobility data from Google,
Twitter, NOAA, and the US Census Bureau, and built machine learning models on top of them to understand
the impact of COVID-19 on community mobility around the US. I used these models to predict future
mobility, glean insights, and offer solutions for optimizing lockdown policies.
Tools and Frameworks: Scikit-Learn, GeoPandas, Folium, Seaborn, Matplotlib,
Google Maps, TensorFlow, Git
Databases, Automated Scripting, and Other Projects
Philadelphia Tax Delinquency Project: I created an interactive map for finding
the tax agency closest to each delinquent real estate property.
I also scripted automated notification emails that give each taxpayer
the specific delinquency details.
Food Inspection Web Service: I built and maintained a live database and a
web application for food inspection
services, which used custom algorithms to find the best matches for inspection records.
Tools and Frameworks: Python, Django, PostgreSQL, Bottle, psycopg2, Jupyter Notebooks.
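The closest-agency matching in the Philadelphia project can be sketched as a nearest-neighbor search over great-circle distances. This is a minimal illustration under assumptions, not the project's code: the agency names and coordinates are made up, and `haversine_km` and `nearest_agency` are hypothetical helpers.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_agency(property_coords, agencies):
    """Return the name of the agency closest to the given property."""
    lat, lon = property_coords
    return min(agencies, key=lambda name: haversine_km(lat, lon, *agencies[name]))


# Hypothetical agency locations and one hypothetical delinquent property.
agencies = {
    "Office A": (39.9526, -75.1652),
    "Office B": (40.0379, -75.0410),
}
closest = nearest_agency((39.9600, -75.1700), agencies)
```

A linear scan like this is adequate for a handful of agencies; with many agencies or properties, a spatial index would be the usual next step.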